Exporting web application pages to Excel/PDF - asp.net

We have a requirement to export different pages of our IE-only web application to Excel/PDF documents.
The pages have graphics, grids, text, etc. They should also be printable.
I heard WebSupergoo mentioned, but have no experience with it.
I am in the research phase now and I wonder what tools/technologies/methods are out there for this task?
I would appreciate any pointers/direction.
Thanks!

We have used ABCpdf by WebSupergoo which includes the ability to retrieve a URL and convert it to a PDF (see documentation). This means all we need to do is provide a suitably formatted version of the page in plain old HTML and point ABCpdf at this URL and it will convert everything automatically for us - beats having to build the page up manually element by element.
I should add that this isn't perfect - we have had some issues relating to matters like paging (very difficult to page HTML when you need things like headers and footers on every page) but for simple uses it's up to the job.
You can get ABCpdf free if you're prepared to link to them.

To export to Excel, you can simply write out an HTML table as HTML and name the file whatever.xls. Excel will automatically convert the HTML table to a spreadsheet. I've been using that trick for many, many years. If you're using something like a DataGrid, it is even easier: write out the contents of the control to an HTML file (or string) and return it as a .xls file.
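The Excel trick described above is language-neutral, so here is a minimal sketch of it in Python rather than ASP.NET. The function names and the sample data are made up for illustration, and the `application/vnd.ms-excel` content type is an assumption about what the server should send. Newer Excel versions may warn that the file format and extension do not match before opening such a file.

```python
from html import escape


def table_to_xls_html(headers, rows):
    """Render headers and rows as a bare HTML table that Excel can open."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<html><body><table><tr>{head}</tr>{body}</table></body></html>"


def xls_response_headers(filename):
    """HTTP headers that make the browser save the HTML as an .xls download."""
    return {
        "Content-Type": "application/vnd.ms-excel",
        "Content-Disposition": f'attachment; filename="{filename}"',
    }


# Hypothetical sample data standing in for the contents of a grid control
html_doc = table_to_xls_html(["Name", "Total"], [["A & B", 10], ["C", 20]])
headers = xls_response_headers("whatever.xls")
```

The cell values are HTML-escaped so that characters such as `&` or `<` in the data do not break the table.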
For PDF, I recommend iTextSharp. It's really easy to use and has worked well for me for many years. You can use either the iText (Java version) documentation or the iTextSharp documentation; the methods and classes are the same (the capitalization may differ, but you should be able to figure it out).
Links
http://itextsharp.sourceforge.net/

Related

Convert webpage from HTML to PDF?

I have a website with the following structure:
Tab Container - having 4 Tab panels
Each tab panel has 4 gridviews, separated by line breaks.
When I am in a particular tab, I want an 'export to pdf' button that generates a PDF containing the 4 gridviews visible in that tab panel. The same goes for all the other tab panels.
I have searched a lot and found many articles about using itextsharp, wkhtmltopdf, PDF generators, etc.; however, I can't seem to find a fully implemented solution anywhere.
Can anyone guide/suggest anything ?
I always use wkhtmltopdf to convert an HTML page to PDF. (You will need server access to install it, though.)
It works very well: the output looks the same as the web site, and text is saved as actual text (in vectors).
I've used CutePDF's API and it seems to work pretty well.
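As a rough sketch of how wkhtmltopdf might be driven from server-side code, the following Python builds the command line and runs it. It assumes `wkhtmltopdf` is installed and on the PATH; the helper names, the URL, and the output file name are placeholders, not something from the answers here.

```python
import subprocess


def build_wkhtmltopdf_command(url, output_path, page_size="A4"):
    """Build the argument list: options first, then input URL, then output file."""
    return ["wkhtmltopdf", "--page-size", page_size, url, output_path]


def convert_to_pdf(url, output_path, page_size="A4"):
    """Run wkhtmltopdf and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_wkhtmltopdf_command(url, output_path, page_size), check=True)


# Placeholder URL and file name; nothing is executed at this point
command = build_wkhtmltopdf_command("http://www.yourwebsite.com", "page.pdf")
```

Passing the arguments as a list rather than one shell string avoids quoting problems when the URL contains characters such as `&`.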
http://www.cutepdf.com/Solutions/
You can do this in two ways, either handle it on your server or use a third party service.
If you want to convert an HTML page to a PDF on your server, you can use wkhtmltopdf (a simple shell utility that converts HTML to PDF using the WebKit rendering engine and Qt). I haven't used it with .NET, but I have seen many examples.
If you would rather use a third-party service, www.impdf.com could be used. It's a free service, and you do not even need to register. I used it once, but not for long (I later switched to wkhtmltopdf to gain some performance).
Which method to use depends on your requirements. If impdf is enough for you, a link such as the following will do:
Convert this page to a PDF
A4 page: impdf.com?url=http://www.yourwebsite.com&--page-size=A4
Letter page: impdf.com?url=http://www.yourwebsite.com&--page-size=Letter
Adobe ColdFusion has a tag called <CFPDF> built in.
http://help.adobe.com/en_US/ColdFusion/10.0/CFMLRef/WSc3ff6d0ea77859461172e0811cbec22c24-7995.html
Furthermore, it has web services which can bridge the gap to ASP.NET.

How do I output HTML form data to PDF?

I need to collect data from a visitor in an HTML form and then have them print a document with the appropriate fields pre-populated. They'll need to have a couple of signatures on the document, so it has to be printed.
The paper form already exists, so one idea was to scan it in, with nothing filled out, as an image. I would then have the HTML form data print out using CSS for positioning and using the blank scanned form as a background image.
A better option, I would think, would be to automatically generate the PDF with this data, but I'm not sure how to accomplish either.
Suggestions and ideas would be greatly appreciated! =)
I would have to respectfully disagree with Osvaldo. Using CSS to align on a printed document would take ages to do efficiently in the aspect of cross-browser integration. Plus, if Microsoft comes out with a new browser, you're going to have to constantly update for the new use in browsers.
If you know any PHP (and if you know JavaScript and HTML, basic PHP is very simple), here's a good library you can use, FPDF:
Thankfully, PHP doesn't deprecate a whole lot of methods and the total code is less than 10 lines if you have to go in and change things around.
You can control printed documents acceptably well with CSS, so I would suggest you try that option first, because it's easier.
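The CSS approach from the question (form values positioned over a scan of the blank form) could look roughly like the sketch below, written here in Python for illustration. The field names, the millimetre coordinates, and the `blank-form.png` file name are all hypothetical. The scan is placed as an `<img>` rather than a CSS background, on the assumption that some browsers skip background images when printing unless the user enables them.

```python
from html import escape

# Hypothetical field positions in millimetres from the top-left corner of the page
FIELD_POSITIONS = {"full_name": (40, 55), "date": (150, 55)}


def render_filled_form(form_data, scan_url="blank-form.png"):
    """Return an A4-sized HTML fragment with each value placed over the scan."""
    fields = "".join(
        '<span style="position:absolute;left:{}mm;top:{}mm">{}</span>'.format(
            left, top, escape(form_data.get(name, ""))
        )
        for name, (left, top) in FIELD_POSITIONS.items()
    )
    return (
        '<div style="position:relative;width:210mm;height:297mm">'
        '<img src="{}" style="position:absolute;left:0;top:0;width:100%;height:100%" alt="">'
        "{}</div>"
    ).format(escape(scan_url, quote=True), fields)


page = render_filled_form({"full_name": "Ann <Lee>", "date": "2009-05-01"})
```

Using physical units (mm) keeps the positions tied to the paper form rather than to screen pixels, though the exact offsets would still need to be checked in each target browser.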
This is actually a great PHP library for converting HTML to PDF documents: http://code.google.com/p/dompdf/ (there are many demos available on the site).
XSL-FO is what I would recommend. XSL-FO is, along with XSLT and XPath, one part of the XSL family; it was designed as an abstract representation of a formatted document (containing text, graphic elements, fonts, styles, etc.).
XSL-FO documents are valid XML documents, and there are tools and APIs that let you convert an XSL-FO document to MS Word, PDF, RTF, etc. Depending on the technology you use, a quick Google search will tell you what is available.
Here are a few links to help you get started with XSL-FO:
http://en.wikipedia.org/wiki/XSL_Formatting_Objects
http://www.w3schools.com/xslfo/xslfo_intro.asp
http://www.w3.org/TR/xsl11/
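To give an idea of what an XSL-FO document looks like, here is a small Python sketch that builds a minimal one with the standard library. The page master name `A4`, the page dimensions, and the sample paragraphs are illustrative choices; the resulting XML would still have to be handed to an FO processor of your choice to produce a PDF.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name for an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def build_fo_document(paragraphs):
    """Build a one-page-master XSL-FO document with one block per paragraph."""
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(
        masters,
        fo("simple-page-master"),
        {"master-name": "A4", "page-height": "29.7cm", "page-width": "21cm"},
    )
    ET.SubElement(page, fo("region-body"))
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    for text in paragraphs:
        block = ET.SubElement(flow, fo("block"))
        block.text = text
    return ET.tostring(root, encoding="unicode")


# Hypothetical form values rendered as two paragraphs
fo_xml = build_fo_document(["Name: Ann Lee", "Date: 2009-05-01"])
```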

Create pdf document in asp.net 2.0

I am using asp.net 2.0 with c#.
I have to convert my label text into pdf. For this I have used this tutorial
http://www.codeproject.com/KB/aspnet/Creating_PDF_documents_in.aspx
now I am facing two problems:
Every time, it creates 1.pdf. What happens if many users want to see the PDF version of a page at the same time?
My label text contains HTML markup, and that markup shows up as-is in the output. I don't want the HTML to be displayed in the PDF.
please let me know if you have any other way to create a pdf.
Thanks in advance.
Creating a PDF with HTML-formatted content is not entirely trivial, and the CodeProject sample code isn't quite suitable for that. You'll most likely want to look into a (commercial) third-party solution for this: I myself use Siberix Report Writer: it's flexible, quite affordable, works in partial-trust scenarios (nice for shared web hosting environments) and most importantly doesn't require a per-server license, so you can embed it in your product without redistribution issues.
Item 1) Cache your PDF files on disk. When a PDF is requested, check whether it has already been created (i.e. there is a file on disk) and generate it if not. Then send the PDF using Response.WriteFile.
Item 2) If you are trying to print formatted HTML into a PDF, then you will need something that is capable of rendering HTML. There are a number of HTML-to-PDF converters; however, I have not found them to be all that good. If you are comfortable with PHP, there are some pretty good converters you can use. Joomla supports HTML to PDF, so whilst it may not be the exact solution, it may be a good starting point.
I would also suggest you take a look at Aspose PDF.
I would suggest using RDLC Report or Crystal Reports, as suggested by @Jeroen.
Cete Pdf Generator has HtmlTextArea element and supports some limited HTML
http://www.cete.com/support/net_help_library/html/ceTeDynamicPDFPageElementsHtmlTextArea.htm
ABCpdf is another commercial component which converts html to pdf.
http://www.websupergoo.com/abcpdf-5.htm
You could try the itextsharp library. I've not used it but it has been highly commended by other developers I know. http://sourceforge.net/projects/itextsharp/
Regarding the caching issue: I would check the file system for a PDF named according to a convention. If the file is found, serve it. Otherwise, call another method which generates the PDF and saves it to the drive. This way only the very first request causes the PDF to be generated. The naming convention will be key here. The basic implementation won't be thread-safe, but it's a good start.
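The check-then-generate caching described above can be sketched as follows, in Python for brevity. The naming convention (a SHA-1 hash of a page key), the `generate` callback, and the sample key are assumptions made for illustration. Writing to a temporary file and renaming it is one way to soften the thread-safety problem: two simultaneous first requests may both generate the PDF, but a reader should never see a half-written file.

```python
import hashlib
import os
import tempfile


def cached_pdf_path(cache_dir, page_key):
    """Naming convention: one file per page key, named by its SHA-1 hash."""
    digest = hashlib.sha1(page_key.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".pdf")


def get_or_create_pdf(cache_dir, page_key, generate):
    """Return the cached PDF path, generating the file only if it is missing."""
    path = cached_pdf_path(cache_dir, page_key)
    if not os.path.exists(path):
        data = generate(page_key)
        # Write to a temp file in the same directory, then rename it into place
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_path, path)
    return path


# Hypothetical generator standing in for the real PDF library call
calls = []


def fake_generate(key):
    calls.append(key)
    return b"%PDF-1.4 placeholder for " + key.encode("utf-8")


cache_dir = tempfile.mkdtemp()
first = get_or_create_pdf(cache_dir, "report?id=42", fake_generate)
second = get_or_create_pdf(cache_dir, "report?id=42", fake_generate)
```

Because the file name is derived from the page key instead of being a fixed `1.pdf`, different pages and users no longer overwrite each other's output.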
I use CrystalReports. It can create a PDF on the fly and output it to disk or http directly.

ASP.NET website, server-side DOCX to PDF conversion

I've been having a heckuva time with this problem, and there seems to be a lot of noise out there in search engines in getting to the bottom of it, so forgive me if I've missed a silver bullet out there.
The base need is that I have to generate a PDF document that has both static and dynamic elements. I started to do this by having a PDF template with all the static content, and then I wanted to inject various dynamic elements into it. The problem is that PDFs are not meant to be manipulated that way, and depending on the size of the dynamic text I put in there, might overflow text on other pages. I was using iTextSharp but can't get past this problem.
A possible fallback is to generate a DOCX, which I've done before, and then convert it into a PDF on the backend. The only libraries I've found to do this are paid apps (like Aspose). There are examples out there that convert to PDF without these libraries, but they seem to require a client-side application. I'm doing this via IIS.
To make a long story longer...are there free libraries that will convert a DOCX file to a PDF server-side without launching client applications to do so?
There are a few choices here:
Build a COM interop class that opens your .docx and performs a 'Save As' on it. The MSDN link you gave doesn't have to run client-side; it only requires the Office assemblies to be in the GAC or in your ASP.NET application's bin directory.
Buy a third-party component to do the work for you. Here's just one example, with no guarantees.
I'm not familiar with any good free ones, but we used Aspose.Words to achieve something similar to what you describe. We keep Word templates with static text and mail-merge fields. The templates can be regular Word documents, they don't have to be .dot templates. Mail-merge fields can be either single fields or repeatable data in tables so you can easily generate pretty complex documents without doing dynamic document editing. (Which is always an option)
Using Aspose for this was so friction free that I would suggest using Aspose unless the cost (which is significant) is a show-stopper. The support is also good which is always an added bonus.
There are always some caveats...
I would have liked more control over the PDF compatibility of the generated PDFs. We had some issues with older clients reading the generated PDFs.
Mail-merge is not fun. Complex mail-merge expressions were time-consuming to get right.
I just found a very simple solution for converting any file from the command line using LibreOffice:
soffice.exe --headless --convert-to pdf file.xls
(google for the rest)
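A minimal sketch of calling that LibreOffice command from server-side code, here in Python, might look like this. It assumes `soffice` is installed and reachable on the PATH; the helper names and file names are placeholders, and the `--outdir` option is added so the PDF lands in a known directory.

```python
import subprocess


def build_soffice_command(input_path, output_dir, executable="soffice"):
    """Build the headless LibreOffice command that converts a document to PDF."""
    return [
        executable,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", output_dir,
        input_path,
    ]


def convert_document(input_path, output_dir):
    """Run the conversion and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_soffice_command(input_path, output_dir), check=True)


# Placeholder file names; nothing is executed at this point
command = build_soffice_command("report.docx", "out")
```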

What is a good, free solution for a rich text editor and conversion to HTML?

Simple situation. I'm trying to write my own blog with a minor twist. Part of the blog will be controlled from a client application instead of a web interface. Basically, I'm still in the design phase and haven't written a single line of code. But I'm trying to combine several techniques into an interesting exercise in software development. Thus I want a client application which I can use to write articles in, which can then upload the article through a web service to the server. (The client would be Delphi 2007/WIN32 and the service is ASP.NET/C# with SQL Server.)
The article itself would be stored in RTF format, including images. This would be in a local database on the client, which would also keep track of the article's status. Once uploaded, it will keep the article synchronised with the version on the server. Technical details are just boring and as said before, still in a design phase...
But I do need a good solution to convert the article from RTF in the database to HTML to be displayed in the blog. I have two options:
Upload both the RTF and HTML from the client, with the client doing the conversion from RTF to HTML.
Upload just the RTF and let it convert on demand on the server. (Or convert on the server when the RTF is uploaded.)
Option 1 would need a Delphi/WIN32 solution to convert it while option 2 would need a .NET solution for the conversion. I don't want an RTF editor for .NET but need a good option to use in Delphi 2007. And I need something to convert an RTF to HTML, which would keep (almost) all formatting and which would include all images from the text. This could be both in .NET or Delphi.
So, I have the following questions:
Is there a good, free RTF editor for Delphi which can handle images?
Is there a good RTF-to-HTML converter for Delphi or C# which can keep as much of its formatting intact as possible, including images?
Some good suggestions for .Net:
Convert Rtf to HTML
Since you provided so much background about why you are doing it, I am going to provide some feedback on the whole plan. This may not be an answer to your question directly though. Sorry.
You might consider looking at Windows Live Writer for the client. If you just implement an API it supports, then it can do all the editing.
Also, I would suggest skipping RTF altogether. Converting from RTF to HTML will lose some formatting and typically creates sub-optimal HTML. Creating RTF with the sole intent of converting it to HTML is a less than optimal solution.
Instead keep it HTML for the round trip. If you must use RTF, then limit the RTF formatting to the HTML formatting you want to support. That way the conversion will be more accurate. Then convert as soon as possible, providing a preview for the poster. Since it won't always convert accurately you want the poster to see any of the conversion oddities before they make them public. That way they can fix them before they are embarrassed.
You'd better take a look at TRichEditWB component in EmbeddedWeb component pack. The whole pack is open-source:
http://www.bsalsa.com/forum/forumdisplay.php?f=29
You can add images, and even controls like buttons and checkboxes, to TRichEditWB. It can also highlight HTML and XML code and recognize URLs automatically.

Resources