Batch PDF file generation using Seam <p:document> (XHTML)

I want to generate a large PDF file using Seam.
Seam supports PDF generation from .xhtml with the <p:document> tag, but this facility only works when there is browser interaction.
I haven't found any way to use Seam's PDF facilities in a background/batch job.

There is a solution here.
https://community.jboss.org/thread/180380#comment24174
There is also a JIRA issue here to make it easier, but since nothing has been done on it since 2008, it might never get done :)
https://issues.jboss.org/browse/JBSEAM-2613

Related

PDF to HTML or similar

I'm building an application to view PDFs through a browser without the need for a plugin on mobile devices. I tried ImageMagick and Ghostscript to convert the pages to images, but the images are far too large and the text becomes unclear. I see websites offering a service that converts PDFs into HTML and does a decent job, but I can't find an example of how this is accomplished. Any help is much appreciated. Thanks!
EDIT: I seem to have read the question backwards. In this case it might be best to parse through the PDF and then format some HTML based on what you find. I believe the javapdf option is capable of this, but I haven't used any of these so I am not sure. If worse comes to worst and you can't find software to disassemble a PDF, you might be able to write your own disassembler in Java or PHP by reading the PDF specification. Best of luck!
http://www.adobe.com/devnet/pdf/pdf_reference.html - PDF Specification (Adobe Modified Version, because they are most popular you may want to support their extensions)
-- OLD -- These websites probably write their own proprietary software to do the trick. If you are truly interested in this undertaking, I would suggest parsing the HTML to get the data and style information and using it to format some sort of PDF writer APIs. A quick Google search yields the following: -- END OLD --
http://www.cutepdf.com/Solutions/
http://ruby-pdf.rubyforge.org/pdf-writer/doc/index.html
http://asprise.com/product/javapdf/
If you are looking at converting PDF to HTML and planning to run the conversion on a server, then you can try pdftohtml. It is a program packaged as part of poppler-utils. I do not know how the program accomplishes it.
While googling I also came across the link below, which explains how scribd.com implements its conversion.
http://coding.scribd.com/2010/06/01/the-perils-of-stacking/
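For the poppler-utils route, the binary is named pdftohtml, and a server-side call could look roughly like the sketch below. The function names and file names are made up for illustration; the -c and -noframes options are standard pdftohtml switches, but check the man page of your installed version before relying on them.

```python
import shutil
import subprocess


def pdftohtml_command(pdf_path, out_base):
    """Build the argument list for poppler's pdftohtml.

    -c         "complex" mode, which tries to keep the page layout
    -noframes  write plain HTML instead of a frameset
    """
    return ["pdftohtml", "-c", "-noframes", str(pdf_path), str(out_base)]


def convert(pdf_path, out_base):
    """Run the conversion; raises if pdftohtml is missing or exits non-zero."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found; install poppler-utils")
    subprocess.run(pdftohtml_command(pdf_path, out_base), check=True)
```

Calling convert("report.pdf", "report") would then write the HTML output next to the given base name; how good the result looks depends heavily on the PDF.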

Cross-platform end-user-help authoring tools

What are some good authoring tools for creating cross-platform help files for end-users? (Our application is using the Qt framework, if that makes any difference.)
Note: I'm not interested in internal API documentation--we're using doxygen for that.
Ideally, a solution would:
Allow us to manage all help content (text, table of contents, images, etc.) in a single location.
Output to native help formats. (CHM for Windows--or at least something we could feed directly into the HTML Help API; not sure what other platforms' "standard" help formats are.)
Decent WYSIWYG support: handle common text entry, images, cross-references, etc. easily, but we can edit the HTML when we need to.
Text-based file-format for help project (XML, etc.) so that it can be versioned in Subversion.
Any hooks that help keep it in synch with the actual code base would be great. (Perhaps somehow a help topic is associated with a code file, and can check Subversion to see if any changes have been made and flag a topic as "possibly out of date" ... am I dreaming?)
Help content can be localized.
Not opposed to commercial product, but a free option would be nice.
I'll go ahead and make this a wiki and start with a few examples. Vote 'em up or down if you have experience with them, and leave some comments. Add additional tools as well.
I just discovered Sphinx; I think I'm in love.
Better than WYSIWYG over HTML: reStructuredText
Outputs to QtHelp (among other things), so it will be easy to distribute (and integrate) in our application.
Not sure about localization yet, but we'll cross that bridge when we need to.
Was easy to set up and "just works"; looks professional.
I have used RoboHelp for years.
It is fine, but the core technology is very old now. Also, the way they lock to Word versions is a total PITA (and has forced me to avoid MS Office upgrades several times).
We are moving to MadCap Flare: http://www.madcapsoftware.com/products/flare/robohelp.aspx
I think DocBook addresses all your requirements except possibly the synchronisation hooks, which I'll think about a bit further. It's essentially an XML vocabulary designed for creating documentation, and is free and open source. It's just a format plus a set of XSL output transforms that convert the DocBook into more useful formats (HTML and thus CHM, JavaHelp, PDF via XSL-FO or TeX).
This means that you still need to choose an XML authoring tool to actually edit it, so things like WYSIWYG will depend on the features of your XML authoring software. We use Syntext Serna as it has good support for WYSIWYG and inline editing of XML #includes (no-one else seems to support the latter). You may find other XML authoring tools better suit your needs - Serna is a reasonably pricey commercial offering.
Docbook provides a lot of flexibility via profiling, which allows you to include/exclude xml elements based on their attributes. Example use cases would be to have slightly different help output for OS=Windows than OS=Linux. Localization is also supported via profiling and other mechanisms.
A fairly good introduction to Docbook can be found here.
We use DocBook for our help format, and compile it to CHM files that contain help only for the features relevant to a specific product (i.e. Enterprise edition has features that aren't in the Standard or Demo versions). The relevant steps are:
1. Run the Profiling XSL templates on the XML Source (using e.g. xsltproc).
2. Run the HTML-Help XSL templates on the output of 1.
3. Compile the output HTML files using Microsoft's HTML Help Compiler (HHC).
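The three steps above can be sketched as a small script that builds the command lines. This is only an illustration: the stylesheet directory, the profiled.xml intermediate file and the os=windows profile are assumptions, and the layout (profiling/profile.xsl, htmlhelp/htmlhelp.xsl, a generated htmlhelp.hhp) is that of the stock DocBook XSL distribution, so adjust it to your own setup.

```python
def docbook_chm_commands(source_xml, xsl_dir, attr="os", value="windows"):
    """Return the three command lines for profiling, HTML Help and CHM.

    xsl_dir is assumed to point at an unpacked DocBook XSL distribution.
    """
    profiled = "profiled.xml"  # hypothetical name for the intermediate file
    return [
        # 1. Keep only elements matching the profile (e.g. os="windows").
        ["xsltproc", "--output", profiled,
         "--stringparam", f"profile.{attr}", value,
         f"{xsl_dir}/profiling/profile.xsl", source_xml],
        # 2. Generate the HTML pages plus the HTML Help project files.
        ["xsltproc", f"{xsl_dir}/htmlhelp/htmlhelp.xsl", profiled],
        # 3. Compile the generated project with the HTML Help Compiler.
        ["hhc", "htmlhelp.hhp"],
    ]


if __name__ == "__main__":
    for command in docbook_chm_commands("manual.xml", "docbook-xsl"):
        print(" ".join(command))
```

Each list can be handed to subprocess.run; the script only prints them so that the commands can be checked before anything is executed.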
Help & Manual
Robohelp
The only one I know is LaTeX, combined with one of the latex2html converters and then a few adaptations to make the resulting HTML ready for the CHM archiver.
Text, HTML, CHM, PDF and PS output are no problem.
Converting to Word via RTF used to be a disaster; I don't know the current status.
There are several LaTeX-to-HTML converters, but they all have their own problems.
The PDFs look absolutely great.
WYSIWYM (via LyX) is possible.
This archive has a bunch of CHMs that way (notably the prog,ref and user parts, the rest (rtl,fcl,lcl) are generated by our own doxygen equivalent, fpdoc)
http://www.stack.nl/~marcov/doc-chm.zip
Note that the above CHMs are made with our own (portable) CHM compiler. Yes, no more workshop.
A Lyx document as PDF and html:
pdf: http://www.stack.nl/~marcov/buildfaq.pdf
html: http://www.stack.nl/~marcov/buildfaq/

ASP.NET library to extract plain text from Open XML file formats

Is there a pre-existing library to extract plain text from Open XML file formats (e.g. docx, pptx, and xlsx)?
I require this to populate a Lucene.NET index.
I've found this example which extracts text from docx and it seems to work okay. But before building my own solution based on this I was wondering if there's something already available for the other file formats?
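For context on what such an extractor has to do: a docx file is a ZIP archive whose main body lives in word/document.xml, with the visible text in w:t elements grouped into w:p paragraphs. The sketch below shows the idea in Python rather than .NET, purely as an illustration. The function name is made up, and it ignores headers, footers, footnotes and nested text boxes.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p and w:t.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path_or_file):
    """Return the body text of a .docx, one line per paragraph."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join the w:t pieces.
        pieces = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

The other formats follow the same pattern with different parts: pptx keeps slide text in ppt/slides/slideN.xml (a:t elements) and xlsx keeps most cell strings in xl/sharedStrings.xml.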
Before spending cash, it may be worth looking at the IFilter interface - these were/are designed to do exactly what you want.
http://msdn.microsoft.com/en-us/library/ms691105
http://www.codeproject.com/KB/cs/IFilter.aspx
(There are some more links at the bottom of the CodeProject page.)
MS provide IFilters for office file types.
http://www.microsoft.com/downloads/details.aspx?familyid=60c92a37-719c-4077-b5c6-cac34f4227cc&displaylang=en
I know that we use this technology to allow us to index PDFs using Lucene but I did not write the actual code and cannot be of much use I am afraid.
If your Google-fu is strong I am sure you can dig up more examples of using IFilters to do exactly what you want.
Take a look at aspose.com; they have a good library that handles both ppt and pptx.
You can try Toxy, an open source text/data extraction framework for .NET. For now, it supports xls, xlsx, doc, docx. It will support pptx in version 1.5 very soon.
For details, you can check here

Printing PDF issue

Most of the PDF printing libraries I have run into require drawing tables, layouts, etc. Which library can simply print a web page in PDF format without requiring too much coding? Any pointers will be greatly appreciated.
Free* .Net Tool:
ABC PDF
*ABCpdf is normally priced at $329. However as a special offer we'll give you a free license key - all you have to do is link back to our web site...
The best free solutions I've found are these:
http://code.google.com/p/wkhtmltopdf/
http://www.rustyparts.com/pdf.php (PHP)
The best non-free solution is here:
http://www.html-to-pdf.net (.NET)
http://www.corda.com/java-pdf.php (Java)
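With wkhtmltopdf from the free list above, "printing" a page needs almost no code, since the tool takes a URL (or local HTML file) and an output path. A minimal wrapper, with made-up function names and assuming the wkhtmltopdf binary is on the PATH, might look like this:

```python
import subprocess


def wkhtmltopdf_command(page_url, pdf_path):
    """Build the basic invocation: wkhtmltopdf <input> <output>."""
    return ["wkhtmltopdf", str(page_url), str(pdf_path)]


def print_page(page_url, pdf_path):
    """Render the page to a PDF; raises if the tool exits non-zero."""
    subprocess.run(wkhtmltopdf_command(page_url, pdf_path), check=True)
```

For example, print_page("http://example.com/", "example.pdf") would write the rendered page to example.pdf in the current directory.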

Programmatically generate InfoPath form template?

Is it possible to programmatically generate an InfoPath 2007 form template (the .xsn file, i.e. the form definition)?
I know that there is no object model for the InfoPath 2007 form designer, but does anyone know of any third party libraries?
The form view itself is an XSL file, so it should be possible. I would have thought that it's a common use case, too.
It is possible to generate the manifest.xsf, xsl and xml files from a structured source (let's say an xml) and then pack these (as a .cab) with the extension .xsn
(The .xsn file is nothing but a renamed .cab!)
This is only a raw concept - it could be refined if the purpose was a bit more explicit. Why generate? Are you going to create a bunch of different files? What for?
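A quick way to check the "renamed .cab" point on an existing template is to look at the first four bytes: cabinet files start with the signature MSCF. The helper below is a hypothetical sketch that only inspects the header and does not validate the rest of the archive.

```python
def looks_like_cabinet(path):
    """Return True if the file starts with the cabinet signature 'MSCF'."""
    with open(path, "rb") as handle:
        return handle.read(4) == b"MSCF"
```

If this returns True for your .xsn, any cabinet tool should be able to unpack it so you can study the manifest.xsf and view files that a generator would have to produce.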
There are no libraries or APIs to do this. While generating a template is possible, you will need to write it all yourself. Obviously this will not be an easy task and will be prone to errors. I would recommend reviewing your requirements to ensure this is truly necessary. InfoPath is quite flexible; without knowing the details of your project, there is a good chance you can get the functionality you need with a single template.

Resources