Converting MS Word Documents to PDF in ASP.NET [closed] - asp.net

Similar questions have been asked, but nothing exactly like mine, so here goes.
We have a collection of Microsoft Word documents on an ASP.NET web server with merge fields whose values are filled in as a result of user form submissions. After the field merge, the server must convert the document to PDF and stream it down to the browser. Our first inclination was to use the Visual Studio Tools for Office API; however, we ran into this warning from Microsoft:
Microsoft does not currently recommend, and does not support, Automation of Microsoft Office applications from any unattended, non-interactive client application or component (including ASP, ASP.NET, DCOM, and NT Services), because Office may exhibit unstable behavior and/or deadlock when Office is run in this environment.
It looks like the field manipulation can be done using the Open XML SDK, but what's the best way to convert Word 2007 documents to PDF without opening Word? The optimal solution would be low-cost, scalable, have a low memory footprint, be easy to deploy, and have a .NET API.

It's not exactly open source, but Aspose has a couple of products that can do that:
Aspose.Pdf.Kit
Aspose.Pdf.Kit is a non-graphical PDF® document manipulation component that enables both .NET and Java developers to manage existing PDF files as well as manage form fields embedded within PDF files. Aspose.Pdf is perfect for creating new PDF files; however, developers often need to edit already existing PDF documents. Aspose.Pdf.Kit allows them to do just that. Aspose.Pdf.Kit allows developers to create powerful applications for merging data directly into PDF documents as well as for updating and managing PDF documents. Aspose.Pdf.Kit is a wonderful product and works great with the rest of our PDF products.
and Aspose.Pdf
Aspose.Pdf is a non-graphical PDF® document reporting component that enables either .NET or Java applications to create PDF documents from scratch without utilizing Adobe Acrobat®. Aspose.Pdf is very affordably priced and offers a wealth of strong features including: compression, tables, graphs, images, hyperlinks, security and custom fonts. Aspose.Pdf supports the creation of PDF files through API, XML templates and XSL-FO files. Aspose.Pdf is very easy to use and is provided with 14 fully featured demos written in both C# and Visual Basic.
Check out the API and demos. You can download a DLL for free to try it out. I've used both before and they work out great.
There's also iTextSharp, which is a C# port of iText, a Java PDF library. I've heard that some people have tried it, with mixed results.

The question is "MS Word Documents to PDF in ASP.NET", so I am very puzzled why Aspose.Pdf and Aspose.Pdf.Kit are recommended above. You need to use Aspose.Words, because that's the component that supports converting Microsoft Word documents to PDF.

Check out Microsoft's resource on Saving Word 2007 Documents to PDF and XPS Formats using C# or VB.

ActivePdf DocConverter - http://www.activepdf.com/
But it requires Office installed on the server for good quality conversion.

Aspose.Words may be the best option for you, but it doesn't convert all visual elements perfectly.
Have a look at the Muhimbi PDF Converter Web Services. It runs on Windows as a service, but can be accessed from any non-Windows web services capable environment including Java and .NET.
Although this solution requires MS Office to be installed on a server (not necessarily the same server as your application), it is very robust and provides perfect conversion fidelity. It goes to great lengths to get around the deadlock problems Microsoft refers to in its KB article.
To generate or modify MS Word files, I recommend using the free Open XML SDK for Microsoft Office. Eric White maintains a really good blog about it.
Disclaimer: I worked on this product. Having said that, it works great.

You should try using OpenOffice for this. It is free and supports a whole range of file conversions. I have used it to convert DOC and DOCX files to HTML, with fantastic results.
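As a rough sketch of what driving an office suite from code could look like, the snippet below shells out to a headless converter. It assumes LibreOffice (the OpenOffice fork), whose soffice binary accepts --headless, --convert-to and --outdir; Apache OpenOffice itself may not accept the same flags. The function names and the default "soffice" executable name are illustrative and not part of the answer above; Python is used only to keep the example short.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, soffice="soffice"):
    """Build the headless LibreOffice command line that converts `source` to PDF."""
    return [
        soffice,
        "--headless",            # run without any UI
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_pdf(source, out_dir, soffice="soffice", timeout=120):
    """Run the conversion and return the path where the PDF is expected to appear."""
    subprocess.run(
        build_convert_command(source, out_dir, soffice),
        check=True,       # raise if soffice exits with a non-zero status
        timeout=timeout,  # do not let a hung converter block the request forever
    )
    return Path(out_dir) / (Path(source).stem + ".pdf")
```

A web application would call convert_to_pdf on the merged document and then stream the returned file to the browser. Each call starts a new process, so a busy server would probably want to queue conversions rather than run them all in parallel.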

ABCpdf is another popular component that will let you convert Word documents to PDF under ASP.NET; however, I believe it too makes use of Microsoft Office or OpenOffice.
http://www.websupergoo.com/abcpdf-office-docs.htm

Microsoft's PDF add-in for Word seems to be the best solution for now, but keep in mind that it does not convert every Word document correctly; in some cases you will see a huge difference between the Word document and the output PDF. Unfortunately, I couldn't find any API that converts all Word documents correctly.
The only way I found to make sure the conversion was 100% correct was to convert the documents through a printer driver. The downside is that documents are queued and converted one at a time, but you can be sure the resulting PDF looks exactly like the Word document. I personally preferred UDC (Universal Document Converter). I also installed Foxit Reader (free version) on the server, then printed the documents by starting a "Process" with its Verb property set to "print". You can also use FileSystemWatcher to raise a signal when the conversion has completed.
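The print-then-wait flow above can be sketched roughly as follows. This is not the .NET code the answer refers to: it is a Python stand-in that assumes the default printer is the PDF printer driver and that the output path is known in advance. os.startfile with the "print" operation exists only on Windows, and the polling loop is a simple substitute for FileSystemWatcher. Both function names are made up for the example.

```python
import os
import time


def send_to_default_printer(path):
    """Ask the shell to print `path` with its associated application (Windows only)."""
    if not hasattr(os, "startfile"):
        raise OSError("os.startfile is only available on Windows")
    os.startfile(path, "print")  # same idea as Process.Start with Verb = "print"


def wait_for_stable_file(path, timeout=60.0, poll=0.5):
    """Return True once `path` exists and its size has stopped changing.

    Returns False if that does not happen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    while time.monotonic() < deadline:
        if os.path.exists(path):
            size = os.path.getsize(path)
            # Two consecutive identical, non-zero sizes: assume the driver is done.
            if size > 0 and size == last_size:
                return True
            last_size = size
        time.sleep(poll)
    return False
```

Because the printer driver handles one job at a time, calls like these would normally sit behind a single worker that takes documents from a queue.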

Related

Programmatic access to the Word/XLS/PPT Accessibility Checker

I am working on a web application that takes Microsoft Office documents (Word, Excel, PowerPoint) as input and generates PDF documents. While it's possible to create an accessible PDF with the API/library I am currently using, I am looking for an API or library that will help me scan the input document (Word, PowerPoint, Excel) for accessibility compliance.
If the input document itself lacks the semantic metadata needed for accessibility, the resulting PDF will not be accessible either.
MS Word itself has a scripting interface for VBScript (Windows/Mac) and AppleScript (Mac only). Not sure how far you can get with those, but I seem to remember that they both expose a lot of information about Word documents, so this is a possible pathway.
LibreOffice has a Python scripting interface; this may be another viable approach.
There are certainly command-line tools that can manipulate Word files in various ways. Try this post:
Creating & Editing MS-Word documents on a linux server?

Edit in Word using Wopi and Office Online Server

I am working on a project where we have implemented content management with Word.
We have some Word files that are processed using Open XML.
Users can open those files in two ways: download a copy or edit online. Online editing is implemented using Office Online Server and a custom WOPI server, built based on this example.
Editing online works fine, but Word Online has limited features compared to desktop Word.
I am trying to build functionality similar to SharePoint, where the user has two options: Edit in Word and Edit in Browser.
In Office Online Server I don't have such options; I can only edit in the browser.
Even in edit mode, SharePoint provides an Edit in Word link, whereas Office Online Server does not.
My question is: how is this implemented in SharePoint?
In other words, am I missing something in my WOPI server that would enable it, or has Microsoft built this functionality into SharePoint itself, without the need for WOPI and/or OWA?
Any ideas would be appreciated!
To enable "Edit in Word" in Office Online Server when using a WOPI handler, you need to set the ClientUrl property in CheckFileInfo (and CheckFolderInfo if you implement that). ClientUrl should be set to a direct editable link for the document file, either WebDAV or FSHTTP, but you could even use a file:// link for testing.
When you set the ClientUrl property, Office Online behavior becomes very similar to OneDrive/SharePoint Online. The current WOPI documentation is a bit outdated: it lists this property under "Unused and future properties", but there is nothing secret about it. I asked dochelp@microsoft.com, which is Microsoft's "Open Specifications Support" mailbox, mentioned in many of their presentations and publications about WOPI and Office Online.
(Screenshots not preserved: Word Online Reading View; Word Online Editing View after clicking OPEN IN WORD.)
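To make the ClientUrl suggestion concrete, here is a minimal sketch of a CheckFileInfo response body. It assumes the handler returns JSON with the required WOPI properties (BaseFileName, OwnerId, Size, UserId, Version) plus ClientUrl. The helper name, the IDs, and the example WebDAV URL are made up for illustration, and Python is used only for brevity; a real handler would live in the WOPI host itself.

```python
import json


def build_check_file_info(base_file_name, size, owner_id, user_id, version,
                          client_url=None):
    """Build the dictionary a WOPI host returns from CheckFileInfo."""
    info = {
        "BaseFileName": base_file_name,
        "OwnerId": owner_id,
        "Size": size,
        "UserId": user_id,
        "Version": version,
    }
    if client_url:
        # Direct, editable link to the file (WebDAV or FSHTTP); per the answer
        # above, this is what makes the "Edit in Word" option appear.
        info["ClientUrl"] = client_url
    return info


# Hypothetical values, for illustration only.
response_body = json.dumps(
    build_check_file_info(
        base_file_name="contract.docx",
        size=48213,
        owner_id="owner-1",
        user_id="user-42",
        version="7",
        client_url="https://files.example.com/webdav/contract.docx",
    )
)
```

The URL in ClientUrl has to be something desktop Word can open and save to on its own, which is why the answer points at WebDAV or FSHTTP rather than the WOPI endpoints.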
I'm pretty sure that this functionality (Edit in Word) is not part of Office Online Server and that it doesn't use the WOPI protocol. In previous versions of SharePoint it was implemented using WebDAV, and I guess this hasn't changed. If you want to support opening/editing/saving, you should implement your own WebDAV server. You can save a lot of time if you use a pre-built server like the one from ITHit. They also have a JS framework to support opening files from the browser.
If you want a cheap, cross-browser alternative that will just invoke the editing apps, I suggest you have a look at Office URIs.
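For reference, an Office URI is just a link of the form scheme:command|u|document-url, for example ms-word:ofe|u|https://... to open a document for editing. The sketch below builds such links; the helper name and the example URL are hypothetical. It covers only the "ofv" (open for viewing) and "ofe" (open for editing) commands and expects the document URL to be percent-encoded already.

```python
OFFICE_SCHEMES = {"ms-word", "ms-excel", "ms-powerpoint", "ms-visio"}
OFFICE_COMMANDS = {"ofv", "ofe"}  # open for view / open for edit


def office_uri(scheme, command, document_url):
    """Return an Office URI that asks the desktop app to open `document_url`."""
    if scheme not in OFFICE_SCHEMES:
        raise ValueError(f"unsupported scheme: {scheme}")
    if command not in OFFICE_COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    if " " in document_url or "|" in document_url:
        raise ValueError("document_url must be percent-encoded")
    return f"{scheme}:{command}|u|{document_url}"


# Hypothetical document URL, for illustration only.
edit_link = office_uri("ms-word", "ofe", "https://files.example.com/docs/contract.docx")
```

Such a link can be placed directly in an anchor tag. The document URL still has to point at a location Word can save back to, so this complements a WebDAV endpoint rather than replacing it.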

ASP.NET pdf converter

I am looking for an easy way to convert documents from other formats (doc, html, xls...) to PDF in ASP.NET.
Is iTextSharp a good choice? Can iTextSharp convert documents from another format to PDF?
What library would you suggest?
I've been using Winnovative for all my PDF generation for the past few years:
http://www.winnovative-software.com/
Fair few good features, and simple to implement, if you don't mind paying for a license.
The default standard for this task should be Microsoft Office SharePoint Server. Another option would be to use Microsoft Office applications from ASP.NET through Automation, combined with a PDF printer (you will need a copy of Microsoft Office installed on the server). There are many PDF printers out there (CutePDF, for example), but if you can afford a commercial option, I recommend Amyuni PDF Converter. There are samples of Word/Excel to PDF conversion using Office Automation with this product.
I work as a Developer Evangelist at Aspose. You may want to try the Aspose.Total for .NET product suite, which lets you convert various file formats (DOC/DOCX/PPT/PPTX/XSL/HTML etc.) to PDF. You can also pick just the components you need. Complete samples, tutorials, and support are available for these components.
Please note that these components are standard .NET assemblies and you can use them either in ASP.NET or Windows Forms applications.
Give the Muhimbi PDF Converter Services a look. It installs in your environment as a scalable and robust Windows Service and has specifically been designed for use from server based applications such as ASP.NET.
It comes with a friendly web services based interface that allows it to be used from most modern environments such as Java and .NET. It supports all common as well as some not so common file formats. Watermarking and PDF Security is included as well. If you have SharePoint in your environment then a SharePoint optimised version is available as well.
Disclaimer, I have worked on this product so the usual disclaimers apply. Having said that, it works great.

Is it possible to automate Visio with ASP.NET?

My clients are trying to revive an ASP.NET 1.0 application (yes, you read that right) that generated data-driven Visio Gantt diagrams. I have access to the code (VB.NET), but there are no notes, comments, or documentation, and no employees from 2003 still around. Compounding the issue, I'm pretty new on the scene (ASP.NET 3.5+ only), so the project structure looks very foreign to me (.resx files?).
I've tried including the Visio Interop libraries, with little success. I tried following this article, but when adding the MS Visio 12.0 type library reference to the project solution in VWD Express 2010, I get an error that reads, "A reference to "Microsoft Visio Viewer 12.0 Type Library" could not be added. Converting the type library to a .NET assembly failed. No process is associated with this object." I don't know what that means, but I sense it'll be a huge headache to resolve.
At this point I'm stuck and considering porting this feature to a more current platform. Can anybody suggest anything?
Visio has an XML format (.vdx).
If you don't need Visio to help you with layout or connections, you might be able to generate the XML files yourself, then have your ASP app serve them up as consumable Visio files.
If you need Visio's Gantt-chart add-in features, or Visio's export to web or image features, then this might not be the way to go. But if you only need to place shapes on a page, set text and other data fields, and have a fairly simple layout and simple connecting lines, you should be able to go this route.
The last download link in this article is for a presentation on Visio and XML that I gave a while back:
http://www.visguy.com/2006/11/30/visio-and-xml-conference-resources/
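To give a feel for the generate-the-XML route, here is a minimal sketch that writes a VDX skeleton with one shape per task. It assumes the Visio 2003 XML namespace and the VisioDocument / Pages / Page / Shapes / Shape / XForm element layout, with positions and sizes in inches. It has not been checked against Visio, and a usable file would likely also need geometry or masters so that the shapes actually draw. The task tuples are made-up sample data, and Python is used only for brevity.

```python
import xml.etree.ElementTree as ET

VDX_NS = "http://schemas.microsoft.com/visio/2003/core"
ET.register_namespace("", VDX_NS)  # serialize with a default xmlns, no prefix


def q(tag):
    """Qualify a tag name with the VDX namespace."""
    return f"{{{VDX_NS}}}{tag}"


def build_vdx(boxes):
    """Build a VDX document string from (text, pin_x, pin_y, width, height) tuples."""
    doc = ET.Element(q("VisioDocument"))
    pages = ET.SubElement(doc, q("Pages"))
    page = ET.SubElement(pages, q("Page"), {"ID": "0", "Name": "Page-1"})
    shapes = ET.SubElement(page, q("Shapes"))
    for shape_id, (text, pin_x, pin_y, width, height) in enumerate(boxes, start=1):
        shape = ET.SubElement(shapes, q("Shape"), {"ID": str(shape_id), "Type": "Shape"})
        xform = ET.SubElement(shape, q("XForm"))
        # PinX/PinY locate the shape's center on the page; Width/Height size it.
        for tag, value in (("PinX", pin_x), ("PinY", pin_y),
                           ("Width", width), ("Height", height)):
            ET.SubElement(xform, q(tag)).text = str(value)
        ET.SubElement(shape, q("Text")).text = text
    return ET.tostring(doc, encoding="unicode")


# Hypothetical Gantt-style rows: one bar per task.
vdx_text = build_vdx([
    ("Design", 2.0, 8.0, 3.0, 0.5),
    ("Build", 4.5, 7.0, 2.0, 0.5),
])
```

The web application would return the resulting string as a .vdx download. For a Gantt chart, the bar position and width would be computed from each task's start date and duration.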
You should not access the automation API of any Office program from ASP.NET or any other server environment. It is unsupported, will fail at random, and may cause you to violate the terms of your license with Microsoft.
Tell them "no". You'll be much happier.
Well, you may download and use the Aspose.Diagram library. It works with Microsoft Visio files without requiring Microsoft Office Visio to be installed. Developers can create, open, and manipulate the elements of diagrams and export to many other supported file formats. Based on your scenario, you could read the details from a database backend and then create Visio diagrams from them. This is achievable using the Aspose.Diagram API. Please refer to the technical resources of Aspose.Diagram for .NET.
I work as a Developer Evangelist at Aspose.

Best solution for exporting Word documents to PDF programmatically (without using a "software printer")?

I'm looking for a way to export a Word document as a PDF. I would like to do this without the use of a "software printer" (such as CutePDF, etc.) and stick to reference assemblies if at all possible. I'm using the Microsoft Office Interop Assemblies to generate a Word document, which I save to a temporary directory. So it's not necessary for this solution to interact directly with Microsoft Office, unless it needs to.
Office 2007 has a built-in (or add-on) PDF converter, so you can save Office 2007 files as PDF without much hassle.
Otherwise, you'll have to use some sort of conversion assembly (there should be commercial assemblies that perform this task), a conversion application that accepts command-line arguments, or maybe even a web-based service for Office-to-PDF conversion.
