I downloaded XnView two years ago and added title/subject/keywords metadata to a bunch of JPEGs. Recently, however, I compared the keywords shown by XnView and Adobe Bridge and realized that they differ. I believe the title/subject is not the same either. I noticed this because some stock photo agencies do not read the metadata written by XnView; they usually leave the title field blank. It looks like XnView uses the XMP format whereas Adobe uses IPTC. Any suggestions on how I can batch convert
Does Adobe software support IPTC metadata these days? They are the authors of XMP, which is a competing standard, so I wonder. Also, see https://en.wikipedia.org/wiki/Extensible_Metadata_Platform#Support_and_acceptance for a list of software with which you can check the actual content of the metadata for a particular file (I would suggest http://owl.phy.queensu.ca/~phil/exiftool/ as the most versatile option).
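If you just want a quick check of which kind of metadata block a given JPEG actually carries, before reaching for a full tool, something like the following Python sketch may help. It assumes the usual embedding conventions: XMP in an APP1 segment whose payload starts with the Adobe XMP namespace string, and IPTC inside an APP13 "Photoshop 3.0" segment as image resource 0x0404. The function name `metadata_blocks` is made up for this example, and the check is only a heuristic; it does not parse or convert anything.

```python
import struct

XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"  # APP1 payload prefix for XMP packets
IPTC_HEADER = b"Photoshop 3.0\x00"                 # APP13 payload prefix (Photoshop resources)
IPTC_RESOURCE = b"8BIM\x04\x04"                    # image resource 0x0404 holds IPTC data


def metadata_blocks(jpeg_bytes):
    """Return the set of metadata kinds ('xmp', 'iptc') found in a JPEG byte string."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    found = set()
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # lost sync with the segment structure; stop rather than guess
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            break  # start of scan: compressed image data follows, no more headers
        # The 2-byte length counts itself but not the 2-byte marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(XMP_HEADER):
            found.add("xmp")
        elif marker == 0xED and payload.startswith(IPTC_HEADER) and IPTC_RESOURCE in payload:
            found.add("iptc")
        pos += 2 + length
    return found
```

Calling `metadata_blocks(open("photo.jpg", "rb").read())` on a file tagged in each program should show whether one writes only XMP and the other only IPTC. The actual batch conversion is still a job for a dedicated tool such as ExifTool.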
In Paw, it was possible to easily share a paw file in a repository as part of the documentation because it was stored in plain text, making changes to it legible.
Artsy is doing it in its energy project and they wrote some blog posts about this.
However, since the 3.x release, it appears that the file is now stored in binary format.
Am I missing a hidden option to save in XML format, or is the feature completely gone?
I am working on a learning project for mobile devices that requires (or would at least benefit from) the ability to export to a SCORM-compatible format. I see that SCORM has a "Package Interchange Format" (PIF) based around a .zip file. I am new to SCORM and am trying to understand exactly what this file must contain. Specifically, is the PIF file just a format for generating interchangeable data between systems, or is it more complicated than that?
For some context, imagine the use case of a set of questions/sections that a user has to run through on a native mobile app, and at the end, we want to offer the ability for the user to "export" their data in a SCORM-compliant fashion. Is this simply a matter of exporting information about a) the questions and b) the answers into some .xml format, or is there more to it? I notice a lot of the documentation around SCORM seems to focus on JavaScript and HTML. Is SCORM HTML-specific, or are native apps reconcilable with SCORM, at least from the export perspective?
Apologies if any of this is basic stuff. Just trying to wrap my head around the standard and how it does or does not apply to what I'm doing.
The PIF is really a very small detail of SCORM's packaging. It only says that you can distribute your content in zip format, but not what that zip should contain.
What a SCORM (1.2) file should contain is described in great detail in the SCORM CAM book. To summarize very quickly, you need:
All the files necessary for the content to run (images, html files, javascript files, css etc)
A file called imsmanifest.xml that describes a few things about your content, the files it contains and possibly how they interact with the LMS they run on. It can vary from very simple to very complicated.
Optionally, metadata in XML format
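To give a feel for the simple end of that range, here is a rough Python sketch that generates a single-SCO manifest. The namespaces and element names are written from memory of SCORM 1.2 and should be checked against the CAM book; the identifiers (`example_manifest`, `org_1`, `item_1`, `res_1`) and the function name `build_manifest` are placeholders invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespaces as commonly seen in SCORM 1.2 manifests (verify against the CAM book).
IMSCP = "http://www.imsproject.org/xsd/imscp_rootv1p1p2"
ADLCP = "http://www.adlnet.org/xsd/adlcp_rootv1p2"


def build_manifest(title, launch_file, files):
    """Return a minimal single-SCO imsmanifest.xml as a string."""
    ET.register_namespace("", IMSCP)
    ET.register_namespace("adlcp", ADLCP)

    def q(tag):
        return f"{{{IMSCP}}}{tag}"

    manifest = ET.Element(q("manifest"), {"identifier": "example_manifest", "version": "1.0"})

    metadata = ET.SubElement(manifest, q("metadata"))
    ET.SubElement(metadata, q("schema")).text = "ADL SCORM"
    ET.SubElement(metadata, q("schemaversion")).text = "1.2"

    # One organization with one item that points at the single resource below.
    orgs = ET.SubElement(manifest, q("organizations"), {"default": "org_1"})
    org = ET.SubElement(orgs, q("organization"), {"identifier": "org_1"})
    ET.SubElement(org, q("title")).text = title
    item = ET.SubElement(org, q("item"), {"identifier": "item_1", "identifierref": "res_1"})
    ET.SubElement(item, q("title")).text = title

    # The resource lists the launch page and every file the content needs.
    resources = ET.SubElement(manifest, q("resources"))
    resource = ET.SubElement(resources, q("resource"), {
        "identifier": "res_1",
        "type": "webcontent",
        f"{{{ADLCP}}}scormtype": "sco",
        "href": launch_file,
    })
    for name in files:
        ET.SubElement(resource, q("file"), {"href": name})

    return ET.tostring(manifest, encoding="unicode")
```

For example, `build_manifest("Quiz", "index.html", ["index.html", "quiz.js"])` gives a manifest string you could write to imsmanifest.xml next to those files. Real manifests usually also carry schema locations and sometimes sequencing or metadata, which are left out here.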
So, SCORM does not care whether or where you include your questions and answers. It doesn't know about them. That is your content's responsibility: it should include them and present them to the user when run. What SCORM can do is let your content communicate with the LMS it runs on, so that the results of these questions are persisted.
For now, I'd suggest that you look at some existing SCORM files to get an idea of what the imsmanifest.xml file should look like, then study the SCORM CAM book, and things will get rolling.
The trouble with SCORM is that it has to be launched from within an LMS. If you're building an external app that has to communicate with an LMS, take a look at either LTI (http://www.imsglobal.org/toolsinteroperability2.cfm) or TinCanAPI (http://tincanapi.com/).
SCORM 2004 sample https://github.com/cybercussion/SCOBot/
You zip the contents of the directory. Some LMSs expect the imsmanifest.xml to be located in the root of the zip.
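A small Python sketch of that zipping step, assuming the content lives in one directory with imsmanifest.xml at its top level. The function name `build_pif` is made up here. The point is that entries are stored relative to the content directory, so the manifest ends up at the root of the zip rather than inside a wrapper folder.

```python
import zipfile
from pathlib import Path


def build_pif(content_dir, zip_path):
    """Zip the contents of content_dir so that imsmanifest.xml sits at the zip root."""
    content_dir = Path(content_dir)
    if not (content_dir / "imsmanifest.xml").is_file():
        raise FileNotFoundError("imsmanifest.xml must sit at the top of the content directory")
    # Note: zip_path should live outside content_dir, or the archive would include itself.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(content_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the content directory, with forward slashes.
                zf.write(path, path.relative_to(content_dir).as_posix())
    return zip_path
```

Zipping the directory itself (rather than its contents) is the usual way to end up with the manifest one level too deep.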
Some people are using native apps in an LMS format and loading the SCOs into an HTML view, but as stated above, SCORM expects JavaScript-to-JavaScript communication.
What are some good authoring tools for creating cross-platform help files for end-users? (Our application is using the Qt framework, if that makes any difference.)
Note: I'm not interested in internal API documentation--we're using doxygen for that.
Ideally, a solution would:
Allow us to manage all help content (text, table of contents, images, etc.) in a single location.
Output to native help formats. (CHM for Windows--or at least something we could feed directly into the HTML Help API; not sure what other platforms' "standard" help formats are.)
Decent WYSIWYG support: handle common text entry, images, cross-references, etc. easily, but we can edit the HTML when we need to.
Text-based file-format for help project (XML, etc.) so that it can be versioned in Subversion.
Any hooks that help keep it in synch with the actual code base would be great. (Perhaps somehow a help topic is associated with a code file, and can check Subversion to see if any changes have been made and flag a topic as "possibly out of date" ... am I dreaming?)
Help content can be localized.
Not opposed to commercial product, but a free option would be nice.
I'll go ahead and make this a wiki and start with a few examples. Vote 'em up or down if you have experience with them, and leave some comments. Add additional tools as well.
I just discovered Sphinx; I think I'm in love.
Better than WYSIWYG over HTML: reStructuredText
Outputs to QtHelp (among other things), so it will be easy to distribute (and integrate) in our application.
Not sure about localization yet, but we'll cross that bridge when we need to.
Was easy to set up and "just works"; looks professional.
I have used RoboHelp for years.
It is fine, but the core technology is very old now. Also, the way they lock to Word versions is a total PITA (and has forced me to avoid MS Office upgrades several times).
We are moving to MadCap Flare http://www.madcapsoftware.com/products/flare/robohelp.aspx
I think DocBook addresses all your requirements except possibly the synchronisation hooks, which I'll think about a bit further. It's essentially an XML vocabulary designed for creating documentation, and is free and open source. It's just a format plus a set of XSL output transforms that convert the DocBook into more useful formats (HTML and thus CHM, JavaHelp, PDF via XSL-FO or TeX).
This means that you still need to choose an XML authoring tool to actually edit it, so things like WYSIWYG will depend on the features of your XML authoring software. We use Syntext Serna as it has good support for WYSIWYG and inline editing of XML #includes (no-one else seems to support the latter). You may find other XML authoring tools better suit your needs - Serna is a reasonably pricey commercial offering.
Docbook provides a lot of flexibility via profiling, which allows you to include/exclude xml elements based on their attributes. Example use cases would be to have slightly different help output for OS=Windows than OS=Linux. Localization is also supported via profiling and other mechanisms.
A fairly good introduction to Docbook can be found here.
We use Docbook for our help format, and compile it to CHM files that contain help only for the features relevant to a specific product (ie Enterprise edition has features that aren't in the Standard or Demo versions). The relevant steps are:
1. Run the Profiling XSL templates on the XML source (using e.g. xsltproc).
2. Run the HTML-Help XSL templates on the output of step 1.
3. Compile the output HTML files using Microsoft's HTML Help Compiler (HHC).
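As a rough sketch, those three steps could be scripted like this in Python. It only builds the command lines; it does not run them. The stylesheet locations (`profiling/profile.xsl` and `htmlhelp/htmlhelp.xsl` under an `xsl_dir` you supply), the `profile.condition` parameter, the intermediate file name `profiled.xml` and the generated project name `htmlhelp.hhp` are assumptions based on a stock DocBook XSL distribution, not necessarily our exact setup.

```python
def chm_build_commands(source_xml, edition, xsl_dir):
    """Return the three command lines, in order, as argv lists.

    Assumes the commands are run from the build directory, because the
    HTML Help stylesheet writes its output files into the current directory.
    """
    # Step 1: keep only elements whose condition attribute matches this edition.
    profile = [
        "xsltproc",
        "--stringparam", "profile.condition", edition,
        "--output", "profiled.xml",
        f"{xsl_dir}/profiling/profile.xsl",
        source_xml,
    ]
    # Step 2: turn the profiled DocBook into HTML pages plus an HTML Help project.
    to_html_help = ["xsltproc", f"{xsl_dir}/htmlhelp/htmlhelp.xsl", "profiled.xml"]
    # Step 3: compile the generated project into a .chm file.
    compile_chm = ["hhc", "htmlhelp.hhp"]
    return [profile, to_html_help, compile_chm]
```

You could feed each list to `subprocess.run`. Be careful about treating a non-zero exit code from the help compiler as a failure, since as far as I recall it does not follow the usual convention.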
Help & Manual
Robohelp
The only one I know is LaTeX, plus one of the LaTeX-to-HTML converters, and then a few adaptations to make the resulting HTML ready for the CHM archiver.
Text, HTML, CHM, PDF, PS: no problem.
Converting to Word via RTF used to be a disaster; I don't know the current status.
There are several LaTeX-to-HTML converters, but they all have their own problems.
The pdfs look absolutely great.
WYSIWYM (via LyX) is possible.
This archive has a bunch of CHMs that way (notably the prog,ref and user parts, the rest (rtl,fcl,lcl) are generated by our own doxygen equivalent, fpdoc)
http://www.stack.nl/~marcov/doc-chm.zip
Note that the above CHMs are made with our own (portable) CHM compiler. Yes, no more HTML Help Workshop.
A LyX document as PDF and HTML:
pdf: http://www.stack.nl/~marcov/buildfaq.pdf
html: http://www.stack.nl/~marcov/buildfaq/
The last time I produced a catalog I used software called EasyCatalog that worked with Adobe InDesign to merge data from a spreadsheet with graphics. I wouldn't say it was completely successful. I know of one other catalog-building package, Catalog Builder by Computer Pundits. I'm just looking for suggestions from anyone who has gone through this process on what software I should use.
InDesign can create a really beautiful output from XML. Depending on the catalog content's complexity, you can either have a straightforward mapping of the elements in your catalog to the paragraph and character styles of the IDD file, or you may need to preprocess the XML with XSLT.
For example, if your data source can output the content as XML, but it doesn't map easily to InDesign tables, XSLT can be used to make the XML more "IDD-friendly" before you import it.
IDML is another way to handle XML content; instead of importing the XML content manually or with a script into your catalog template, you generate the IDML directly from your XML. (IDML files are a package of XML files that describe the page/spreads, fonts, swatches, text, images, etc. of the InDesign file.) You're probably going to need XSLT consulting help if this is not a skill you already have.
Take a look at the InDesign documentation for XML for the version you use. IDML is for CS4 or CS5.
Have a look at xCS.press by a company in Belgium. XML markup that is parsed to InDesign. Great for product catalogs.
I wouldn't use Catalog Builder by Computer Pundits again. I've used it in the past (mostly their website builder) and it is completely outdated in my opinion. Their templates are not easily customized and it was pretty slow for me. As for their website builder (in case you're wondering), it's all tables, with very few CSS IDs or classes throughout the HTML.
I haven't used InDesign, but it seems there are a lot of scripting features.
The easiest thing that comes to mind is creating an XML Schema with IDML to get data from Excel into an InDesign document.
An XML schema is basically a template for XML documents; in Microsoft Office these are called XML Maps.
I am not familiar with catalog tools; try superuser.com as well for third-party tools and tips.
John, have you come across additional solutions since posting this? I wonder if you have considered CatBase or EM Software solutions.
InDesign works wonderfully with XML and XSLT. You can export data to XML only from Excel on Windows, and only once you have created an XML-compatible worksheet. Don't save the file as an XML Spreadsheet; that file is useless in InDesign.
What I do is create a schema file (XSD) for the data that I want to use and import it into Excel on Windows (the Mac version doesn't support XML). Once the schema is imported, you can create an XML worksheet based on it and then copy and paste the data from the non-XML worksheet into the XML sheet. Once the data is in that sheet, you can export it to an XML file and import that into InDesign.
As mentioned above you can map XML tags to Paragraph and character styles and create dynamic layout directly in InDesign or by using an XSLT to structure the data before you import it.
MS Access allows you to export directly to XML. If you move your data to InDesign you can save the time needed to build the XML spreadsheet. Image references have to be built properly before you export to XML, or you can build an XSLT that will do it on the fly as you import the data into your layout.
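If neither the Excel XML worksheet nor Access is convenient, a different route (not the one described above) is to save the sheet as CSV and build the XML with a short script. The Python sketch below does that and also builds the image references. It assumes the column headers are valid XML element names, that one column (here called `image`) holds image file names, and that the layout picks up images from an `href` attribute containing a `file://` URL; the element names `catalog` and `product` and the function name are made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath


def rows_to_catalog_xml(csv_text, image_dir, image_column="image"):
    """Convert CSV text (first row = headers) into a simple catalog XML string."""
    root = ET.Element("catalog")
    for row in csv.DictReader(io.StringIO(csv_text)):
        product = ET.SubElement(root, "product")
        for column, value in row.items():
            if column == image_column:
                # Build the image reference as a file URL under image_dir.
                href = "file://" + str(PurePosixPath(image_dir) / value)
                ET.SubElement(product, "image", {"href": href})
            else:
                # Every other column becomes an element named after its header.
                ET.SubElement(product, column).text = value
    return ET.tostring(root, encoding="unicode")
```

With an absolute `image_dir` such as `/catalog/images`, a row whose image cell is `lamp.jpg` yields `href="file:///catalog/images/lamp.jpg"`. An XSLT can still be applied afterwards if the structure needs reshaping for tables.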
The entire process is described in detail in the book A Designer's Guide to Adobe InDesign and XML.
If the data is in MS Access, WoodWing has a product that allows you to interface and import data for a catalog. I have not used it personally but I know people who have. Another product, In-Data, also interfaces with InDesign, but I have no experience with that either. I usually just use XML and XSLT myself.
I've used EasyCatalog very successfully for a number of years now, even for really large catalogs (35,000+ articles). In the meantime, I offer EC consulting and hands-on user training as well.
I'd need many more details of what went wrong with your specific catalog in order to be able to point your attention to a different solution that may better fit your needs.
I personally would not recommend Jim Maivald's solution because a) Excel and Unicode are not friends, b) working with Excel and XML really is a pain, c) the process is relatively complicated, d) you need a lot of specialized skills in XML, XSLT programming and so on, e) it's not bi-directional, and f) when updating you have to do the whole process again.
With EasyCatalog, you just import your data into a panel and place them from there into your document, from manually up to fully automatically. It's really easy, and it's bi-directional - so you can update your document from the database at any time and - if you need to - your database from your document. By the way, you can import your data directly from Excel into an EasyCatalog panel as well.
However, EasyCatalog might not be the best solution if graphics are included in your spreadsheet as well - but who would ever include the real graphics in a spreadsheet instead of the name (and maybe path) of the actual graphic files?
What my users will do is select a PDF document on their machine, upload it to my website, where I will convert into an HTML document for display on the website. The document will be stored in a database after conversion.
What's the best way to convert a PDF to HTML?
I have been handed a requirement where a user would create a "news" story as a PDF and then upload it to the server, where it will be converted to HTML and displayed on the website.
Any document creation software that can save documents as PDF can save them as HTML. I'm assuming the issue is that your users will be creating rich documents (lots of embedded images), which results in multiple files, and your requirements stem from a desire to make uploading these documents as simple as possible to the user.
There are numerous conversion packages that can probably do this for you, however when you're talking about rich content, you are talking about text plus images. Those images have to be stored somewhere and served somehow, and whatever conversion method you use will require you to examine all image sources to make sure they point to valid locations on your server.
I would like to suggest an alternate way of doing this that you can take to your team: Implement one of the many blog APIs for publishing content. There are free and commercial software packages that use these APIs to publish content directly to a website, such as Windows Live Writer and Microsoft Word. Your users can simply create their content and upload it directly to your website without having to publish it as PDF first then upload it. So the process becomes much smoother for your users, and you get the posts in a form that doesn't require you spend thousands of dollars on developing or buying conversion code.
The two most common APIs are the MetaWeblog API and the Movable Type API. Both are very simple and easy to implement. I think this way would be a MUCH better alternative than what you're thinking about doing.
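To show how little is involved, here is a Python sketch of the XML-RPC request a client sends for a new post under the MetaWeblog API. It only builds the request body; nothing is sent. The `metaWeblog.newPost` method and the `title`/`description` struct members come from the API itself, while the function name and the sample values are made up for illustration.

```python
import xmlrpc.client


def new_post_request(blog_id, username, password, title, html_body, publish=True):
    """Return the XML-RPC request body for a metaWeblog.newPost call."""
    # In MetaWeblog, "description" carries the post body (HTML is allowed).
    post = {"title": title, "description": html_body}
    return xmlrpc.client.dumps(
        (blog_id, username, password, post, publish),
        methodname="metaWeblog.newPost",
    )
```

On the server you would parse the same structure, check the credentials and store the post. A desktop client, or `xmlrpc.client.ServerProxy` pointed at your endpoint, produces exactly this kind of payload.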
I don't think converting a PDF to an HTML string is necessarily the best idea, especially if you want to export it back as PDF. PDF files often contain binary elements such as images, so you may be better off converting the file to ASCII via an encoding such as Base64. That way you will have an ASCII string that you can save into a text field in the DB and then convert back out. Could you expand more on the main requirement?
My recommendation would be to not do it that way IF POSSIBLE (but we all know what managers are like) so...
I would recommend that you stay away from converting the PDF to/from HTML (because unless you can find a commercial solution it will be nigh on impossible) and instead do as has already been mentioned and store it as an encoded Base64 string, or BLOB or some other binary format in the database, and then display it to the user with some sort of PDF view plugin for the browser.
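For what it's worth, storing the file unchanged is only a few lines. Here is a Python sketch using SQLite as a stand-in for whatever database you have; the table and column names are invented for the example. It supports both options mentioned above, a raw BLOB or a Base64 string, and returns the original bytes either way.

```python
import base64
import sqlite3


def init_db(conn):
    """Create the (hypothetical) documents table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents "
        "(name TEXT PRIMARY KEY, is_base64 INTEGER NOT NULL, body BLOB)"
    )


def save_pdf(conn, name, pdf_bytes, as_base64=False):
    """Store the PDF either as raw bytes (BLOB) or as a Base64 ASCII string."""
    body = base64.b64encode(pdf_bytes).decode("ascii") if as_base64 else pdf_bytes
    conn.execute(
        "INSERT OR REPLACE INTO documents (name, is_base64, body) VALUES (?, ?, ?)",
        (name, int(as_base64), body),
    )
    conn.commit()


def load_pdf(conn, name):
    """Return the original PDF bytes, decoding Base64 if that is how it was stored."""
    row = conn.execute(
        "SELECT is_base64, body FROM documents WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    is_base64, body = row
    return base64.b64decode(body) if is_base64 else bytes(body)
```

Base64 makes the stored value roughly a third larger than the file, so a BLOB column is usually the better choice when the database supports one.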
All it took was a simple google search for "PDF to HTML": http://www.gnostice.com/pdf2manyOverview_x.asp. I'm sure there are others.
So while it's 'possible', you may want to explain to your manager that this isn't the best content management solution.
Why not use iTextSharp to read the PDF content? Then you could save both the binary PDF and the text content to the database. You could then let users search the content and download the PDF.
You should look into DynamicPDF. They have a converter (currently Beta) out for serving exactly this purpose. We have used their products with great success (especially for dumping Reporting Services reports directly to PDF).
Ref: http://www.dynamicpdf.com/