Using Java to write RDFa - xhtml

I need to automatically generate (from a database) an XHTML document marked up with RDFa or some other microformat, it doesn't matter which one. How can I best do this using Java? I have been using Jena to output RDF/XML but it doesn't do RDFa unfortunately.

The reason that Jena doesn't provide an RDFa writer is that the whole point of RDFa is to be embedded in some other (human-readable) web page. I think your main option is to use something like Velocity or Freemarker to produce the pages with embedded calls out to Jena to get the appropriate RDF statements. You'll have to handle the RDFa encoding yourself. For testing, you could read your web pages back in using an RDFa reader to see if you get back the right set of triples, but really that's only half the story. You also need to test whether the page expresses the user-intent you want by enabling inline metadata, and that's much harder to test.
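As a rough illustration of "handling the RDFa encoding yourself", here is a minimal plain-Java sketch that writes one resource and its literal properties as an XHTML fragment with RDFa attributes. The class name `RdfaSketch`, the `render` method and the example URIs are hypothetical; in a real setup the subject and property values would come from your Jena model, and a template engine would probably replace the `StringBuilder`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RdfaSketch {

    /** Escapes the characters that are special in XHTML text and attribute values. */
    public static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /**
     * Renders one resource as a <div> carrying RDFa attributes:
     * "about" names the subject, each "property" names a predicate (as a CURIE),
     * and the element text is the literal value.
     */
    public static String render(String subjectUri, Map<String, String> literals) {
        StringBuilder out = new StringBuilder();
        out.append("<div xmlns:dc=\"http://purl.org/dc/elements/1.1/\" about=\"")
           .append(escape(subjectUri)).append("\">\n");
        for (Map.Entry<String, String> e : literals.entrySet()) {
            out.append("  <span property=\"").append(escape(e.getKey())).append("\">")
               .append(escape(e.getValue())).append("</span>\n");
        }
        out.append("</div>\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical data; a LinkedHashMap keeps the properties in insertion order.
        Map<String, String> literals = new LinkedHashMap<>();
        literals.put("dc:title", "Dune");
        literals.put("dc:creator", "Frank Herbert");
        System.out.print(render("http://example.org/book/1", literals));
    }
}
```

Feeding the generated page back through an RDFa parser, as suggested above, is a reasonable way to check that the triples round-trip.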

If you are willing to take another step forward, there are also Grails plugins that provide easy methods to produce RDFa from domain classes in views:
http://grails.org/plugin/rdfa

Related

Rewriting binary links to use CDN

CDN integration seems to be a hot topic among the Tridion crowd. But, somehow, the available discussions mainly revolve around pushing content to and from the CDN. What I'm specifically interested in is:
What is the proper way of modifying/prefixing the outbound links of inline images so that they use the CDN?
The simplest way to go would be to create a post-processing TBB that operates on the Output item and place it inside 'Default Finish Actions'. Doing this on the CD side would seem to be more correct, though, wouldn't it?
EDIT
Consider a fancier case: what if I want not only to modify image paths, but also to wrap the whole image links in ASP.NET controls? Where do I do this?
EDIT 2
So far, I have implemented the tag-to-ASP.NET-control replacement via a TBB. It went smoothly; I only needed to keep an eye on the following subtle matters:
Consider CSS inline styles (e.g. background-image: url(..))
The new TBB needs to be placed after any link-manipulating logic (e.g. Extract Binaries from Html, Publish Binaries in Package, Link Resolver)
The quickest and most robust implementation is probably simple string replacement (in contrast to regular expressions or XML parsing)
To keep the standard "Preview" logic intact, some condition is necessary to trigger the logic
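The string-replacement idea from the notes above can be sketched as follows. This is only an illustration of the logic, written in Java; a real TBB would be .NET code operating on the package's Output item. The class name, the `/images/` prefix, the CDN host and the `previewMode` flag are all hypothetical.

```java
public class CdnRewriteSketch {

    // Hypothetical values: the local path prefix of published binaries and the CDN base URL.
    static final String LOCAL_PREFIX = "/images/";
    static final String CDN_BASE = "https://cdn.example.com";

    /**
     * Prefixes image paths with the CDN base using plain string replacement.
     * Covers src="..." attributes and unquoted CSS url(...) values;
     * in preview mode the markup is returned untouched.
     */
    public static String rewrite(String html, boolean previewMode) {
        if (previewMode) {
            return html;
        }
        return html
                .replace("src=\"" + LOCAL_PREFIX, "src=\"" + CDN_BASE + LOCAL_PREFIX)
                .replace("url(" + LOCAL_PREFIX, "url(" + CDN_BASE + LOCAL_PREFIX);
    }

    public static void main(String[] args) {
        String html = "<img src=\"/images/a.png\"/>"
                + "<div style=\"background-image: url(/images/b.png)\"></div>";
        System.out.println(rewrite(html, false));
    }
}
```

Quoted CSS values such as url('/images/b.png') are not covered by this sketch and would need an extra replacement.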
If you decide to go with ASP.NET controls for your CDN-hosted images, you may consider these phases/steps:
write a TCDL tag (e.g. <tcdl:image id="..." path="..."/>) on CM during rendering
write a TCDL TagHandler implementation that transforms the TCDL into an ASP.NET include during deployment
write the ASCX control to do the CDN lookup proper when the visitor requests the page
I'm not sure whether both steps 2 and 3 are needed. You might also simply write the CDN path during the deployment phase (step 2 above).
At the same time I'd expect you to upload (updated) images to the CDN using a deployer extension, so that it also happens during phase 2.

Allowing user-created templates in an ASP.NET site

I have a website I’m converting from Classic ASP to ASP.NET. The old site allowed users to edit the website template to create their own design for their section of the website (think MySpace, only LESS professional.)
I’ve been racking my brain trying to figure out how to do this with .NET. My sites generally use master pages, but obviously that won’t work for end-users.
I’ve tried loading the HTML templates as a regular text file and parsing it to ‘fit around’ the content place holders. It is as ugly as it sounds.
There’s got to be something generally regarded as the best practice here, but I sure can’t find it.
Any suggestions?
How much control do you want your users to have?
There are a few ways to implement this. I'll give you a quick summary of all the ideas I can think of:
Static HTML with predefined fields.
With this approach you get full control and you minimize the risk of any security vulnerabilities. You would store per-user HTML in a database table somewhere. This HTML would have predefined fields with some markup, like using {fieldName}. You would then use a simple parser to identify curly brackets and replace fieldName with a string value pulled from a Dictionary<String,String> somewhere.
This approach can be fast if you're smart with string processing (i.e. using a state-machine parser to find the curly brackets, sending the rebuilt string either directly to the output or into a StringBuilder, and, importantly, not using String.Replace). The downside is that it isn't a real templating system: there's no provision for looping over result sets (assuming you want to allow that) or for expression evaluation. It does work for simple "insert this content into this space"-type designs.
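A minimal sketch of such a field parser is shown below. It is written in Java rather than C#, but the logic ports directly (`Dictionary<String,String>` becomes a `Map`, and both platforms have a `StringBuilder`). The class and method names are hypothetical, and the choice to leave unknown fields untouched is an assumption, not something the approach requires.

```java
import java.util.Collections;
import java.util.Map;

public class FieldTemplateSketch {

    /**
     * Scans the template left to right. Text outside braces is copied through;
     * "{name}" is replaced when the map has a value for "name".
     * An unknown field or an unclosed '{' is emitted unchanged, so CSS rules
     * like "p { color: red }" survive.
     */
    public static String render(String template, Map<String, String> fields) {
        StringBuilder out = new StringBuilder(template.length());
        int i = 0;
        while (i < template.length()) {
            char c = template.charAt(i);
            if (c != '{') {
                out.append(c);
                i++;
                continue;
            }
            int close = template.indexOf('}', i + 1);
            if (close < 0) {
                // No closing brace anywhere: copy the rest verbatim and stop.
                out.append(template, i, template.length());
                break;
            }
            String value = fields.get(template.substring(i + 1, close));
            if (value != null) {
                out.append(value);   // substituted values are not rescanned
                i = close + 1;
            } else {
                out.append('{');     // not a known field: keep the brace, keep scanning
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = Collections.singletonMap("title", "My page");
        System.out.println(render("<h1>{title}</h1>", fields));
    }
}
```

Values should be HTML-escaped before they go into the map, since the parser inserts them as-is.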
Allow users to edit their own ASPX or ASCX files.
Why build your own templating system if you can use ASP.NET's? Well, this approach is the simplest if you want to build a quick 'n' dirty reporting system, but it fails terribly on security. Unfortunately you cannot run <% %> / <script runat="server"> code from ASPX files in a sandbox or restrict it with CAS, owing to how ASP.NET works (I looked into this myself earlier in "Code Access Security on a per-view ASP.NET MVC basis").
You don't need to store the actual ASPX and ASCX files in the website's filesystem, you can store the files in your database using a VirtualPathProvider implementation, but getting it to work right can be a bit of a pain (especially as the ASP.NET runtime compiles ASPX files, so you'd need to inform it if an ASPX file was modified by the user). You also need to be aware that ASPX loading is tied into the user's request path (unless you're using Routing or MVC) so you're better off using ASCX, not that it matters really.
A custom ASP.NET handler that runs in its own CAS sandbox that implements a fully-fledged templating engine
This is the most painful option, and it exists between the two: you get the flexibility of a first-class templating engine (loops, fields, evaluation if necessary) without needing to open your application up to gaping security flaws. The downside is you need to build pretty much everything by yourself. I won't go into detail here.
Which option you go for depends on what your requirements are. MySpace's system was more "skinning" than "templating", in that the users were free to set a stylesheet and define some arbitrary common HTML rather than modify their page's general template directly.
You can easily implement a MySpace-like system in ASP.NET, assuming that each skinnable feature is implemented as a Control subclass: just extend your Render method to allow for the insertion of said arbitrary HTML. Adding custom stylesheets is also easy: just add the CSS inside a <style type="text/css"> element in your page's <head>.
When/if you do allow the user to enter HTML directly, I strongly recommend you parse it and filter out any dangerous elements (such as <script>, <style>, <object>, etc). If you want to allow for the embedding of YouTube videos and related then you should analyse <object> elements to ensure they actually are of YouTube videos, extract the video ID, then recreate the element from a known, trusted template. It is important that any custom HTML is "tag-balanced" (you can verify this by passing it through a strict XML parser instead of a more forgiving HTML parser, as XHTML is (technically) a subset of HTML anyway), that way any custom markup won't break the page.
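The "tag-balanced" check can be sketched with a strict XML parser, here the JAXP parser that ships with the JDK (Java again, but .NET's XML reader can be used the same way). The class name and the wrapping `<root>` element are assumptions for illustration; the wrapper is only there because a fragment may have several top-level elements.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class TagBalanceSketch {

    /** Returns true if the user-supplied fragment parses as well-formed XML. */
    public static boolean isTagBalanced(String fragment) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so user input cannot pull in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader("<root>" + fragment + "</root>")));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed: unbalanced or mis-nested tags, bad entities, etc.
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isTagBalanced("<b>ok</b><br/>"));
        System.out.println(isTagBalanced("<b><i>mis-nested</b></i>"));
    }
}
```

Note that a strict parser also rejects HTML-only constructs such as `<br>` without a closing slash or named entities like `&nbsp;`, so this is a check for balance, not a replacement for the element filtering described above.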
Have fun.

Minimizing the pain in implementing printable reports

How do you minimize the pain in your development process when it comes to reporting?
For web frameworks, there is a pretty straightforward way to both produce content as well as graphically design it; content is represented semantically through HTML, and the design is separately specified through CSS. And browsers are fairly consistent with how they render the output (and the inconsistencies are well-known and can be planned for). There are even WYSIWYG editors to help out less-CSS-savvy graphical designers.
But what do we do about print content?
At one company, I created a process that worked like this: A script generated a semantic representation through XML. The XML was passed through XSLT to generate an XSL-FO document. Then, this was passed to another tool (Apache FOP, I believe) to generate a PDF. This worked well for that company.
At this company, however, output appearance matters to management, and we have a graphical designer. Currently, we are using a reporting tool (XtraReports from Developer Express, version 8.1). It isn't bad; it outputs to a variety of formats, has a WYSIWYG designer, reports are implemented through C# classes, and it supports data binding to data sets (unfortunately, not POCOs). However, we have some major pain points with this setup:
The reporting framework has major limitations on how you can lay out and group your reporting bands
Presentable elements, especially charts, lack the capabilities we need to fine-tune and achieve the look of our mock-ups.
There is no good way to share styles and layout among reports akin to what we can get through CSS.
Good composability of reusable parts is very hard to implement. So we end up with a lot of copy & paste inheritance of functionality; this is bad news whenever we need to make sweeping changes across all reports.
Now, maybe there's some kick-ass framework out there that can eliminate the pains of reporting frameworks, but I assume that they all have their weaknesses. Do you have a framework or process that works well for you and reduces the pain points inherent in reporting?
Prince XML is a really cool tool which allows you to use HTML or XML styled with CSS (including CSS paged media for printing) and generate PDFs from it.
Option #1: Adobe Acrobat is really nice. You can design form-enabled PDFs and then use something like PDFSharp to manipulate the PDF document. You can create template PDFs that you dump your generated content into. I've done this before and it was pretty successful. It also worked nicely with POCO objects.
Option #2: You could start creating XPS documents, which are XML-based anyway. They can easily be converted to PDF if necessary.
Option #3: Run for your life. (Might not be an option.)
i-net Clear Reports is a nice product. It's based on Java, but you can also use it with ASP.NET through a bridge. A .NET version is in the works if you want to work with POCOs: because the Java version can work with POJOs, the coming .NET version will also work with POCOs.

Should I be using XML + Stylesheets vs. XHTML and CSS?

I have been developing web apps for a while now, and for the past year I have been exploring as many technologies as possible. I know some people are creating pages using XML and XSLT, or maybe CSS style sheets; however, it seems to me that the trends are still not moving in that direction. Plus, it seems less functional and less easy than XHTML/CSS-based pages.
What are the benefits of using XML/XSLT, and is it ideal to start developing in that manner? Is there anything else new that is pulling ahead of the pack in front-end web development?
The reason I am bringing this up is that many people seem to be switching from XML as a data source to JSON, which makes more sense as a data source; however, XML is still functional as a markup language...
And on that note, why would I even want to use XSLT rather than CSS for the XML pages if I were to start developing that way? It seems to me that they serve the same purpose, except that XSLT looks like tag soup.
I hope this question makes sense....
XSLT can be useful if you have an XML data source that needs transforming into HTML. Otherwise you should be using HTML, CSS and jQuery for front-end development.
Right now, there is no reason to use XSLT at all. It's virtually incomprehensible compared to XML/XHTML, and offers no real advantage for you or your users.
As for using XML in lieu of (X)HTML, with the growing acceptance of the emerging HTML5 standard, I can't see why you'd give up canvas and the (eventually, they'll be good!) audio capabilities for XML. Even now, XML is nice for marking up documents, but for marking up a webpage, HTML is king – it's essentially XML tailor-made for the web.
There is no antagonism between XML/XSLT and XHTML/CSS; these are complementary technologies. In my web apps, for example, XHTML pages are produced by means of XML/XSLT (the transformation occurs on the client side).
You'd use XSLT to transform some XML document into XHTML. Then you'd use CSS to style the XHTML.
XSLT is for transforming one XML format into another. The data stays the same, but the representation changes. There is also XSL-FO, a formatting vocabulary that XSLT can target and that a formatter then renders to output such as PDF.
Also note that XSLT can be used on the client or the server side. You can do the XSLT transformation in the browser or with a simple handler on the server. Java-based NoSQL data stores like eXist-db use XQuery to query database entries and XSLT to transform them into any other XML format, including XHTML.
Using XSLT to generate XHTML from simple XML documents basically gives you a templating engine.
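To make the "templating engine" point concrete, here is a small server-side sketch using the standard `javax.xml.transform` API from the JDK. The `<items>`/`<item>` input vocabulary, the inline stylesheet and the class name are made up for the example; the output is a bare list fragment rather than a complete XHTML page.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltTemplateSketch {

    // Hypothetical stylesheet: turns <items><item>..</item></items> into a <ul> list.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/items\">"
        + "<ul><xsl:for-each select=\"item\"><li><xsl:value-of select=\".\"/></li></xsl:for-each></ul>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML document and returns the result as a string. */
    public static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<items><item>a</item><item>b</item></items>"));
    }
}
```

The stylesheet plays the role of the template and the XML document plays the role of the model, and CSS can then style the generated markup as usual.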
Since browsers still lack XForms support, you can use JavaScript and XSLT to transform XForms into valid HTML.
JSON is used to serialize and deserialize objects and transport them, thus replacing XML as a transport format, more specifically as an AJAX query response, in rich internet applications.

What are the benefits of an XML data model over the DataSet model?

At my current job we have a CMS that is .NET/SQL Server based. While customizing a couple of the modules for internal use, I was a little surprised to see that, instead of returning data as a typical result set bound to a DataGrid/DataList/Repeater control, the APIs returned an XML node/collection that was then passed to an XSLT transformation and rendered on the page that way.
What are the benefits to using a model like this?
Using XSLT transformations would enable you to use a different layout and formatting than the standard .Net grid controls. Some people don't approve of using the .Net grids because they can include more HTML than necessary, and because if not managed carefully, they can bloat ViewState.
There was a recent discussion here about the .Net grids being bloatware (but developers use them anyway).
The output pages can be of any type, such as HTML, PHP, etc.
By setting up the data source and XML that the page merely transforms, you have also instantly created a simple 'web service' that can be consumed by other software. For example, it would be trivial to turn that grid into an RSS feed, or to write a program that scrapes the data periodically and sends a more pressing alert.
The XSLT method is very MVC, unit-testing, separate-concerns friendly where ASP.NET controls well... aren't.
caveat: I reject the assumption that MS can write better html/css/js than I can. ASP.NET controls are clunky abominations.

Resources