How can I query a Word docx in an ASP.NET app? - asp.net

I would like to upload a Word 2007 or greater docx file to my web server and convert the table of contents to a simple xml structure. Doing this on the desktop with traditional VBA seems like it would have been easy. Looking at the WordprocessingML XML data used to create the docx file is confusing. Is there a way (without COM) to navigate the document in more of an object-oriented fashion?

I highly recommend looking into the Open XML SDK 2.0. It's a CTP, but I've found it extremely useful in manipulating xmlx files without having to deal with COM at all. The documentation is a bit sketchy, but the key thing to look for is the DocumentFormat.OpenXml.Packaging.WordprocessingDocument class. You can pick apart the .docx document if you rename the extension to .zip and dig into the XML files there. From doing that, it looks like a Table of Contents is contained in a "Structured Document" tag and that things like the headings are in a hyperlink from there. Putzing around with it a bit, I found that something like this should work (or at least give you a starting point).
WordprocessingDocument wordDoc = WordprocessingDocument.Open(Filename, false);
SdtBlock contents = wordDoc.MainDocumentPart.Document.Descendants<SdtBlock>().First();
List<string> contentList = new List<string>();
foreach (Hyperlink section in contents.Descendants<Hyperlink>())
{
contentList.Add(section.Descendants<Text>().First().Text);
}

Here is a blog post on querying Open XML WordprocessingML documents using LINQ to XML. Using that code, you can write a query as follows:
using (WordprocessingDocument doc =
WordprocessingDocument.Open(filename, false))
{
foreach (var p in doc.MainDocumentPart.Paragraphs())
{
Console.WriteLine("Style: {0} Text: >{1}<",
p.StyleName.PadRight(16), p.Text);
foreach (var c in p.Comments())
Console.WriteLine(
" Comment Author:{0} Text:>{1}<",
c.Author, c.Text);
}
}
Blog post: Open XML SDK and LINQ to XML
-Eric

See XML Documents and Data as a starting point. In particular, you'll want to use LINQ to XML.
In general, you do not want to use COM in a .NET application.

Related

Generating Excel Documents with ASP.NET Website

I have an ASP.NET application that helps the user create a Gridview with certain data in it. Once this table is generated I want the user to push a button and be able to save the table as an Excel document.There are two different methods I know of:
Using HtmlTextWriter with ContentType "application/vnd.ms-excel" to send the file as an HttpResponse. I use GridView1.RenderControl(htmlTextWriter) to render the gridview. This almost works, but the excel file always shows a warning when the file opens because the content doesn't match the extension. I have tried various content types to no avail. This makes sense I guess, because I'm using an HtmlWriter. It also doesn't seem a good practice.
The second thing I've tried is generating the Excel file using Office Automation. But for the file to be generated, I need to save it to disk and then read it again. From what I have read, this is the only way, because the Excel object only becomes a real Excel file once you save it. I found that the .saveas method from the Excel class would throw an exception because of write permissions, even if I tried to save in the App_Data folder. So I did some research and found that apparently Office Automation is discouraged for web services: https://support.microsoft.com/en-us/kb/257757
Microsoft does not currently recommend, and does not support,
Automation of Microsoft Office applications from any unattended,
non-interactive client application or component (including ASP,
ASP.NET, DCOM, and NT Services), because Office may exhibit unstable
behavior and/or deadlock when Office is run in this environment.
There surely must be a save way to have a website generate an Excel file and offer it to the user!? I can't imagine that this problem is unsolved or so rare that nobody cares about it, but yet I can't find any good solution to this.
the easiest (and best) way to create an excel file is by using epplus
Epplus sample for webapplication
using (ExcelPackage pck = new ExcelPackage())
{
ExcelWorksheet ws = pck.Workbook.Worksheets.Add("Demo");
//Load the datatable into the sheet, starting from cell A1. Print the column names on row 1
ws.Cells["A1"].LoadFromDataTable(tbl, true);
//Format the header for column 1-3
using (ExcelRange rng = ws.Cells["A1:C1"])
{
rng.Style.Font.Bold = true;
rng.Style.Fill.PatternType = ExcelFillStyle.Solid; //Set Pattern for the background to Solid
rng.Style.Fill.BackgroundColor.SetColor(Color.FromArgb(79, 129, 189)); //Set color to dark blue
rng.Style.Font.Color.SetColor(Color.White);
}
//Example how to Format Column 1 as numeric
using (ExcelRange col = ws.Cells[2, 1, 2 + tbl.Rows.Count, 1])
{
col.Style.Numberformat.Format = "#,##0.00";
col.Style.HorizontalAlignment = ExcelHorizontalAlignment.Right;
}
//Write it back to the client
Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
Response.AddHeader("content-disposition", "attachment; filename=ExcelDemo.xlsx");
Response.BinaryWrite(pck.GetAsByteArray());
}

Reading XML in a string and write to text file in c#

I have XML content that I am retrieving from a service and I want to write it into a text file. I am getting this XML content in a string variable.
How can I read this and write in text file? Please help. I am getting data successfully using this code:
string results = "";
using (WebClient Web = new WebClient())
{
results = Web.DownloadString(WebString.ToString());
}
I have tried some links, but they are not helping
forums.asp.net
Reading-XML-File-and-Writing-it-to-txt-format-in-C
If you want to save it in a file, don't use DownloadString to start with - just use WebClient.DownloadFile.
If you really want to fetch it in memory and then save it, you can save it with:
File.WriteAllText(filename, results);
Note that this code doesn't depend on it being XML at all... nothing you're asking is XML-specific.

How to effectively load in dataTable an XML string that contains html string in its nodes. asp.net 4.0

I want to load XML in a single table of a dataset. I use following code
string val = getAbonentInfoParametr(ai,"abonentDescription");
DataSet ds = new DataSet();
ds.ReadXml(new StringReader(val));
but when I do this, I got three tables because in one node of XML file I now get HTML code that I want to have like a string field in my only table. What should I do?
Also I prefer not to use scheme files because the structure of that xml file can be changeable except several field that I use, please suggest me something.
Use CDATA to wrap around the HTML, otherwise there is no way to differentiate HTML from XML.

Exporting a HTML Table to Excel from ASP.NET MVC

I am currently working with ASP.NET MVC and I have an action method that displays few reports in the view in table format.
I have a requirement to export the same table to an Excel document at the click of a button in the View.
How can this be achieved? How would you create your Action method for this?
In your controller action you could add this:
Response.AddHeader("Content-Disposition", "filename=thefilename.xls");
Response.ContentType = "application/vnd.ms-excel";
Then just send the user to the same view. That should work.
I'm using component, called Aspose.Cells (http://www.aspose.com/categories/.net-components/aspose.cells-for-.net/).
It's not free, though the most powerful solution I've tried +)
Also, for free solutions, see: Create Excel (.XLS and .XLSX) file from C#
Get data from database using your data access methods in dot net.
Use a loop to get each record.
Now add each record in a variable one by one like this.
Name,Email,Phone,Country
John,john#john.com,+12345,USA
Ali,ali#ali.com,+54321,UAE
Naveed,naveed#naveed.com,+09876,Pakistan
use 'new line' code at the end of each row (For example '\n')
Now write above data into a file with extension .csv (example data.csv)
Now open that file in EXCEL
:)

Best method to populate XML from SQL query in ASP.NET?

I need to create a service that will return XML containing data from the database. So I am thinking about using an ASHX that will accept things like date range and POST an XML file back. I have dealt with pages pulling data from SQL Server and populating into a datagrid for visual display but never into XML for delivery, what is the best way to do this? Also if an ASHX and POST isn't the best method for delivery let me know... thanks!
EDIT: These answers are great and pointing me in the right direction. I should have also mentioned that the XML format has already been decided so I can't use any automatically generated one.
Combining linq2sqlwith the XElement classes, something along the lines:
var xmlContacts =
new XElement("contacts",
(from c in context.Contacts
select new XElement("contact",
new XElement
{
new XElement("name", c.Name),
new XElement("phone", c.Phone),
new XElement("postal", c.Postal)
)
)
).ToArray()
)
);
Linq2sql will retrieve the data in a single call to the db, and the processing of the XML will be done at the business server. This splits the load better, since you don't have the sql server doing all the job.
Have you tried DataSet.WriteXml()?
You could have this be the output of a web service call.
Sql Server 2005 and above has a "FOR XML AUTO" command that will convert your recordset to XML for you. Then you just have to return a string from your ASHX.
Beginning with SQL Server 2000, you can return query results as XML. For absolute control of them, use the "FOR XML EXPLICIT" command. You can use any format you desire that way.
http://msdn.microsoft.com/en-us/library/ms189068.aspx
It's as easy as writing your result to the raw output then. For added points, you can return the result set to a XPathDocument, pass it through an XSL transformation, and send the results out in any format you choose (HTML vs XML at the click of a button perhaps).
you can obtained that to a datatable and then call myTable.WriteXML()
if you are populating classes with the your database results then add the serializable attribute to the header your classes, and use the XMLSerializer
Converting an automatically generated format into a specified one is a job for xslt. Just find a way to run the output from the tool through an xslt filter.
Oracle has a great product for doing exactly this job - the oracle XDK. But it's a java thing, not ASP as far as I know.
For an example, this XHTML
http://www.anbg.gov.au/abrs/online-resources/flora/stddisplay.xsql?pnid=2524
is generated automatically from this XML, which is generated by oracle
http://www.anbg.gov.au/abrs/online-resources/flora/stddisplay.xsql?pnid=2524&xml-stylesheet=none
Of course, you are not after XHTML, but some other XML format. But XSLT will do the job.

Resources