Generating pdf reports containing text and chart based data - asp.net

I have to generate good-looking, downloadable PDF reports containing text and chart-based data for a commercial application. I found the iTextSharp library ( http://itextsharp.com/ ), but it isn't as powerful as I expected.
What components do you use for this kind of requirement? Price is important to me too. I'd be grateful for your advice.
Thanks in advance,

Have a look at Aspose.PDF or, alternatively, Tallcomponents. I'm not sure whether they support generating charts out of the box, but for that you could use other libraries such as DotNetCharting or XtraCharts.

If you are open to commercial PDF solutions, I would suggest PageFlex. It is very powerful. Check whether it suits your needs.

I have used ActivePdf in the past to generate reports which contained all types of charts and data. Basically it takes an html page and converts it to PDF so we just had to develop HTML versions of the reports and then if the user wanted a pdf version we just had to create a PDF from the html version of the report. The ActivePdf server had its quirks but it did solve the problem quite well.

DocRaptor.com is a great tool for generating pdfs from HTML. It uses Prince XML, and that quality is hard to beat.

Related

PDF to HTML or similar

I'm building an application to view PDFs through a browser on mobile devices without the need for a plugin. I tried ImageMagick and Ghostscript to convert the pages to images, but the images are far too large and the text becomes unclear. I see websites offering a service that converts PDFs into HTML and does a decent job, but I can't find an example of how this is accomplished. Any help is much appreciated. Thanks!
EDIT: I seem to have read the question backwards. In this case it might be best to parse through the PDF and then format some HTML based on what you find. I believe the javapdf option is capable of this, but I haven't used any of these so I am not sure. If worse comes to worst and you can't find software to disassemble a PDF, you might be able to write your own disassembler in Java or PHP by reading the PDF specification. Best of luck!
http://www.adobe.com/devnet/pdf/pdf_reference.html - PDF Specification (Adobe Modified Version, because they are most popular you may want to support their extensions)
-- OLD -- These websites probably write their own proprietary software to do the trick. If you are truly interested in this undertaking, I would suggest parsing the HTML to get the data and style information and using it to format some sort of PDF writer APIs. A quick Google search yields the following: -- END OLD --
http://www.cutepdf.com/Solutions/
http://ruby-pdf.rubyforge.org/pdf-writer/doc/index.html
http://asprise.com/product/javapdf/
If you are looking at converting PDF to HTML and planning to run the conversion on a server, then you can try pdftohtml. It is a program packaged as part of poppler-utils. I do not know how the program accomplishes it.
While googling, I came across the link below, which explains how scribd.com implements the conversion.
http://coding.scribd.com/2010/06/01/the-perils-of-stacking/

Cross-platform end-user-help authoring tools

What are some good authoring tools for creating cross-platform help files for end-users? (Our application is using the Qt framework, if that makes any difference.)
Note: I'm not interested in internal API documentation--we're using doxygen for that.
Ideally, a solution would:
Allow us to manage all help content (text, table of contents, images, etc.) in a single location.
Output to native help formats. (CHM for Windows--or at least something we could feed directly into the HTML Help API; not sure what other platforms' "standard" help formats are.)
Decent WYSIWYG support: handle common text entry, images, cross-references, etc. easily, but we can edit the HTML when we need to.
Text-based file-format for help project (XML, etc.) so that it can be versioned in Subversion.
Any hooks that help keep it in synch with the actual code base would be great. (Perhaps somehow a help topic is associated with a code file, and can check Subversion to see if any changes have been made and flag a topic as "possibly out of date" ... am I dreaming?)
Help content can be localized.
Not opposed to commercial product, but a free option would be nice.
I'll go ahead and make this a wiki and start with a few examples. Vote 'em up or down if you have experience with them, and leave some comments. Add additional tools as well.
I just discovered Sphinx; I think I'm in love.
Better than WYSIWYG over HTML: reStructuredText
Outputs to QtHelp (among other things), so it will be easy to distribute (and integrate) in our application.
Not sure about localization yet, but we'll cross that bridge when we need to.
Was easy to set up and "just works"; looks professional.
I have used RoboHelp for years.
It is fine, but the core technology is very old now. Also the way they lock to Word versions is a total PITA (and has forced me to avoid MS Office upgrades several times).
We are moving to MadCap Flare http://www.madcapsoftware.com/products/flare/robohelp.aspx
I think DocBook addresses all your requirements except possibly the synchronisation hooks, which I'll think about a bit further. It's essentially an XML vocabulary designed for creating documentation, and is free and open source. It's just a format plus a set of XSL output transforms that convert the DocBook into more useful formats (HTML and thus CHM, JavaHelp, PDF via XSL-FO or TeX).
This means that you still need to choose an XML authoring tool to actually edit it, so things like WYSIWYG will depend on the features of your XML authoring software. We use Syntext Serna as it has good support for WYSIWYG and inline editing of XML #includes (no-one else seems to support the latter). You may find other XML authoring tools better suit your needs - Serna is a reasonably pricey commercial offering.
Docbook provides a lot of flexibility via profiling, which allows you to include/exclude xml elements based on their attributes. Example use cases would be to have slightly different help output for OS=Windows than OS=Linux. Localization is also supported via profiling and other mechanisms.
A fairly good introduction to Docbook can be found here.
We use Docbook for our help format, and compile it to CHM files that contain help only for the features relevant to a specific product (ie Enterprise edition has features that aren't in the Standard or Demo versions). The relevant steps are:
1. Run the Profiling XSL templates on the XML Source (using eg XSLTproc).
2. Run the HTML-Help XSL templates on the output of 1.
3. Compile the output HTML files using Microsoft's HTML Help Compiler (HHC).
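The three steps above could be scripted roughly like this. This is only a sketch under some assumptions: the stylesheet location, the file names and the build_commands / run_pipeline helpers are made up for illustration, and profile.condition is just one of the DocBook profiling parameters (it would need to match whatever attribute you tag your edition-specific content with). It also assumes the HTML Help stylesheet writes its project file under the default name htmlhelp.hhp into the output directory.

```python
import os
import subprocess

# Assumed install location of the DocBook XSL stylesheets; adjust for your system.
DEFAULT_XSL_ROOT = "/usr/share/xml/docbook/stylesheet/docbook-xsl"


def build_commands(source, edition, workdir="build", xsl_root=DEFAULT_XSL_ROOT):
    """Return the command lines for profiling, HTML Help generation and compilation."""
    profiled = f"{workdir}/profiled.xml"
    return [
        # Step 1: profile the source, keeping content marked for this edition.
        ["xsltproc", "--stringparam", "profile.condition", edition,
         "-o", profiled, f"{xsl_root}/profiling/profile.xsl", source],
        # Step 2: run the HTML Help stylesheet on the profiled document,
        # directing the chunked output to the work directory.
        ["xsltproc", "-o", f"{workdir}/",
         f"{xsl_root}/htmlhelp/htmlhelp.xsl", profiled],
        # Step 3: compile the generated project file (assumed name) with HHC.
        ["hhc.exe", f"{workdir}/htmlhelp.hhp"],
    ]


def run_pipeline(source, edition, workdir="build"):
    """Run the three steps in order; exit codes are deliberately not checked here."""
    os.makedirs(workdir, exist_ok=True)
    for command in build_commands(source, edition, workdir):
        subprocess.run(command)
```

Something like run_pipeline("manual.xml", "enterprise") would then build the Enterprise help, and a second call with a different edition value would build another variant from the same source.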
Help & Manual
Robohelp
The only one I know is LaTeX, plus one of the latex2html converters, and then a few adaptations to make the resulting HTML ready for the CHM archiver.
Text, HTML, CHM, PDF, PS: no problem.
Converting to Word via RTF used to be a disaster; I don't know the current status.
There are several LaTeX-to-HTML converters, but they all have their own problems.
The pdfs look absolutely great.
WYSIWYM (via lyx) possible.
This archive has a bunch of CHMs that way (notably the prog,ref and user parts, the rest (rtl,fcl,lcl) are generated by our own doxygen equivalent, fpdoc)
http://www.stack.nl/~marcov/doc-chm.zip
Note that the above CHMs are made with our own (portable) CHM compiler. Yes, no more workshop.
A Lyx document as PDF and html:
pdf: http://www.stack.nl/~marcov/buildfaq.pdf
html: http://www.stack.nl/~marcov/buildfaq/

ASP.NET library to extract plain text from Open XML file formats

Is there a pre-existing library to extract plain text from Open XML file formats (e.g. docx, pptx, and xlsx)?
I require this to populate a lucene.net index.
I've found this example which extracts text from docx and it seems to work okay. But before building my own solution based on this I was wondering if there's something already available for the other file formats?
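For reference, the docx case is small enough to sketch without any library, since a .docx file is a zip archive whose main body lives in word/document.xml, with the visible text held in w:t elements. The snippet below is a rough illustration in Python rather than .NET; docx_to_text is a made-up helper name, and it ignores headers, footers, footnotes and the other formats (pptx, xlsx), which store their text in different parts of the package.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p and w:t.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(docx_file):
    """Return the body text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join the w:t pieces in order.
        pieces = [node.text or "" for node in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

docx_to_text accepts a path or any file-like object, so the returned string could be handed straight to whatever adds documents to the index.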
Before spending cash, it may be worth looking at the IFilter interface - these were/are designed to do exactly what you want.
http://msdn.microsoft.com/en-us/library/ms691105
http://www.codeproject.com/KB/cs/IFilter.aspx
(There are some more links at the bottom of the CodeProject article.)
Microsoft provides IFilters for Office file types.
http://www.microsoft.com/downloads/details.aspx?familyid=60c92a37-719c-4077-b5c6-cac34f4227cc&displaylang=en
I know that we use this technology to allow us to index PDFs using Lucene but I did not write the actual code and cannot be of much use I am afraid.
If your Google-fu is strong I am sure you can dig up more examples of using IFilters to do exactly what you want.
Take a look at aspose.com; they have a good library that handles both ppt and pptx.
You can try Toxy, an open source text/data extraction framework for .NET. For now, it supports xls, xlsx, doc, docx. It will support pptx in version 1.5 very soon.
For detail, you can check here

Printing PDF issue

Most of the PDF printing libraries I ran into require drawing tables, layouts, etc. Which library can simply print a web page in PDF format without requiring too much coding? Any pointers will be greatly appreciated.
Free* .Net Tool:
ABC PDF
*ABCpdf is normally priced at $329. However as a special offer we'll give you a free license key - all you have to do is link back to our web site...
The best free solutions I've found are these:
http://code.google.com/p/wkhtmltopdf/
http://www.rustyparts.com/pdf.php (PHP)
The best non-free solution is here:
http://www.html-to-pdf.net (.NET)
http://www.corda.com/java-pdf.php (Java)
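For the free route, wkhtmltopdf is a command-line tool that takes an input page and an output file, so "printing" a web page needs very little code. A minimal sketch, assuming the wkhtmltopdf executable is installed and on the PATH (the helper names and the example URL are made up for illustration):

```python
import subprocess


def wkhtmltopdf_command(page_url, pdf_path, executable="wkhtmltopdf"):
    """Build the basic command line: wkhtmltopdf <input page> <output pdf>."""
    return [executable, page_url, pdf_path]


def print_page_to_pdf(page_url, pdf_path):
    """Render the page to a PDF file; raises CalledProcessError on failure."""
    subprocess.run(wkhtmltopdf_command(page_url, pdf_path), check=True)
```

From a web application the same command could be launched as an external process, e.g. print_page_to_pdf("http://localhost/report.html", "report.pdf").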

LaTeX equivalent to Google Chart API

I'm currently looking at different solutions for getting two-dimensional mathematical formulas into web pages. I think that the Wikipedia solution (generating PNG images from LaTeX source code) is good enough until we get support for MathML in web browsers.
I suddenly realized that it might be possible to create a Google Charts API equivalent for math formulas. Has this already been done? Is it even possible, given the strange characters involved in LaTeX code?
I would like to hit a URL like latex2png.org/api/?eq="E = mc^2" and get the following response:
edit:
Thanks for the answers so far. However, I am already aware of several tools to generate PNG images from LaTeX source code (both online and from my command line), but what I was looking for was a simple way to get the image via an HTTP GET request. Perhaps such a service does not exist.
Update
As @hughes (and others) pointed out, the previous Google Chart API has been deprecated.
The example I wrote still works as of Sept 2015, but the new one should be used now (documentation):
Old answer
Google Chart can do it (Documentation):
http://chart.apis.google.com/chart?cht=tx&chl=%5CLaTeX
I'm using this with Google Docs, because it doesn't support math yet.
chart.apis.google with background color changed
https://chart.apis.google.com/chart?cht=tx&chf=bg,s,FFFF00&chl=%0D%0A4x_0%5CDelta%28x%29%2B3%5CDelta%28x%29%2B2%5CDelta%28x%5E2%29%3E0%0D%0A
or chart.apis.google with background color transparent and resized
For better readability, here is the URL with the formula decoded:
https://chart.apis.google.com/chart?cht=tx&chs=428x35&chf=bg,s,FFFFFF00&chl=
4x_0\Delta(x)+3\Delta(x)+2\Delta(x^2)>0
Data structure looks like this
{
"cht":"tx",
"chs":"428x35",
"chf":"bg,s,FFFFFF00",
"chl":"4x_0\Delta(x)+3\Delta(x)+2\Delta(x^2)>0"
}
https://chart.apis.google.com/chart?cht=tx&chs=428x35&chf=bg,s,FFFFFF00&chl=%0D%0A4x_0%5CDelta%28x%29%2B3%5CDelta%28x%29%2B2%5CDelta%28x%5E2%29%3E0%0D%0A
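On the "strange characters" worry from the question: they just need to be percent-encoded when the URL is built. Here is a small sketch of how the URLs above could be assembled; formula_url is a made-up helper, and since the old Chart API is deprecated (see the update above), the endpoint itself may no longer respond.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the example URLs above (deprecated API).
CHART_ENDPOINT = "https://chart.apis.google.com/chart"


def formula_url(latex, size=None, background=None):
    """Build a chart URL that renders the given LaTeX formula (cht=tx)."""
    params = {"cht": "tx"}
    if size:
        params["chs"] = size                    # e.g. "428x35"
    if background:
        params["chf"] = f"bg,s,{background}"    # solid fill, RRGGBB or RRGGBBAA
    params["chl"] = latex
    # Percent-encode backslashes, parentheses, '+', '^' and so on, but keep
    # the commas in chf readable.
    return CHART_ENDPOINT + "?" + urlencode(params, safe=",", quote_via=quote)
```

Calling formula_url(r"\LaTeX") produces the same query string as the first example in this answer, and passing size and background reproduces the resized, transparent variant without the leading and trailing line breaks.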
You could try the Online image generator for mathematical formulas for a start.
mathurl is a mathematical version of TinyURL.com. It allows you to reference LaTeXed mathematical expressions using a short url. For example, http://mathurl.com/?5v4pjw will show [LaTeX output Image] which you can then edit. More details on mathurl’s help page
I just ran across MathJax on Ajaxian [via Wayback Machine]:
MathJax seems to have a chance at being a practical solution that offers a high quality display of LaTeX and MathML math notation in HTML pages.
The output is remarkably beautiful, and it's all pure HTML and CSS, which makes it scalable and selectable. Performance is currently a bit sluggish, but this is recognized.
As everyone has said, there are many services that do this already. Here is another easy one that I've used a number of times (and you can install it locally on your server if necessary):
http://www.codecogs.com/components/equationeditor/equationeditor.php
I'd take a good look at how the MediaWiki LaTeX support does it and borrow from there.
Please check out this site for a way to create TeX documents without installing any software. You can then capture the resulting image with any screen capture method and embed it into any website.
Go to http://sharelatex.com
The software is free to use, but you need to register to create documents.

Resources