DTD with RDFa and XHTML 1.0 Transitional support - xhtml

Is there a W3C document type available with both XHTML 1.0 transitional support and RDFa support?
I am aware of the XHTML+RDFa 1.0 (http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd) DTD but that is XHTML 1.1 which is not compatible with my current website. It seems that there is also a HTML4+RDFa 1.0 (http://www.w3.org/MarkUp/DTD/html4-rdfa-1.dtd) DTD available.
My main reason for not serving XHTML 1.1 is Internet Explorer of course, although I could probably serve it as text/html for the IE users if there is no other way.

I don't believe there is such a DTD, no.
HTML is becoming more and more fragmented and creating DTDs for every possible combination is not going to be practical. HTML5 is not SGML based and effectively gives up on DTDs. In future, validators are probably going to have to change so that they call in component collections like RDFa and ARIA, and indicate in their results which collections are required to allow each particular piece of mark-up to be conforming.
Unfortunately, we're really at about the transition point currently, so there's no clean solution to your problem. However, a certain amount of pragmatism will get you a long way.
Now, XHTML+RDFa 1.0 may be defined by DTD as an extension of XHTML 1.1, but it isn't actually XHTML 1.1, nor is it XHTML 1.0 transitional or strict, or indeed anything other than "XHTML+RDFa 1.0".
So you can take a pragmatic serving approach. Consider the HTML5 attitude to this. It says that anything you serve as text/html is an HTML serialization of the object model, regardless of any DOCTYPE that you declare. This is in practice what browsers do anyway.
Similarly, anything you serve with an XML content type such as application/xhtml+xml is an XML serialization. Those parts of the XML that have the xhtml namespace constitute XHTML.
So, in practice, you can serve your XHTML+RDFa 1.0 as text/html or application/xhtml+xml without any difficulty, provided that the mark-up meets the requirements for polyglot documents.
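As a rough illustration of serving the same document under either media type, here is a minimal Python sketch of content negotiation on the Accept request header. The names parse_accept and choose_media_type are made up for this example, the header parsing is deliberately simplified, and falling back to text/html whenever application/xhtml+xml is not listed explicitly is an assumption about what you want, not a rule taken from any spec.

```python
def parse_accept(header):
    """Return {media_type: q} for an Accept header (simplified parsing)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_media_type(accept_header):
    """Use the XML media type only if the client lists it explicitly."""
    prefs = parse_accept(accept_header or "")
    if prefs.get("application/xhtml+xml", 0.0) > 0.0:
        return "application/xhtml+xml"
    # Wildcards such as */* are deliberately ignored: fall back to text/html.
    return "text/html"
```

A client that only sends a wildcard such as */* gets text/html here, which is the safe choice for browsers that cannot handle the XML media type.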
That leaves the validation. Leaving aside RDFa, is there any mark-up that you're using that's conformant XHTML 1.0 Transitional but not conformant XHTML 1.1? If so, do you care enough about perfect validation to either change these, or to back away from using RDFa? Presumably you're using RDFa for your users' benefit, while validation is essentially a convenience tool for yourself.
I faced a similar situation recently, when I decided to add ARIA attributes to my XHTML 1.0 pages. I decided that Accessibility trumps Validity, and I would add the attributes and forget about ensuring my pages were 100% valid.

In reality unless you are concerned with the DTD implementation there are almost no differences between the various XHTML versions and there's almost never a valid reason not to use XHTML 1.1. In what way is your website dependent upon 1.0 transitional? If you can get your site valid with 1.0 strict then moving from that to 1.1 shouldn't cause problems because essentially the only difference is the modularized DTD, which really has no drawbacks to it.
If you're still struggling with the philosophical problem of XHTML MIME I wouldn't worry because conceptually serving the wrong media type with the better organized 1.1 DTD is no more of a crime than with 1.0. The reason RDFa is implemented as it is is because adding the RDFa module to the XHTML 1.1 DTD only involves adding a few lines to the main module. Doing that to the 1.0 DTD would be harder and not as clean.
Some other things to consider are that the XHTML 1.1 second edition spec includes an XML schema implementation. Also, the latest XHTML+RDFa 1.1 working draft finally drops the (stupid) requirement for specifying a doctype altogether, so you could use schema-only validation. This would work out really well if you can figure out a way to use XML/XHTML mimetypes because no doctype declaration is required in order to get standards mode rendering on browsers which support it (all of them but IE8 and below).

Related

What is "Extensible" about XHTML?

Why is XHTML called "eXtensible" (the X in XHTML)? Can we, as individual web developers actually extend it?
What separates it from ordinary HTML?
Well, firstly, things have moved on somewhat, and XHTML isn't really a thing anymore. HTML5 isn't parsed as XML, and XHTML 2.0 was of course cancelled.
Despite that, it's possible to use XHTML if you use the application/xhtml+xml mimetype, just be aware of the various shortcomings of that (any error = yellow screen of death, older IEs don't render anything at all).
For a new project, use the HTML5 doctype and serve as text/html. XHTML can be considered as a failure for many reasons.
Anyway, with XHTML you can do things like this:
<!DOCTYPE html SYSTEM "http://example.com/my-xhtml-custom.dtd">
<html xmlns='http://www.w3.org/1999/xhtml' xmlns:custom="http://example.com/" xml:lang='en-US'>
then copy http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd and edit it how you like, and put it where we referenced earlier.
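To see what the extra namespace declaration buys you at the XML level, here is a small Python sketch that parses a document using the custom prefix from the example above and groups elements by namespace. The custom:rating element and its scale attribute are invented for illustration, and the DOCTYPE is left out because ElementTree does not validate against a DTD anyway.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
CUSTOM_NS = "http://example.com/"  # same namespace URI as in the snippet above

# custom:rating is a hypothetical element, not part of any standard.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:custom="http://example.com/" xml:lang="en-US">
  <head><title>Extended</title></head>
  <body>
    <p>Rated <custom:rating scale="5">4</custom:rating> by readers.</p>
  </body>
</html>"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into its two parts."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def elements_by_namespace(text):
    """Map each namespace URI to the local element names found under it."""
    groups = {}
    for element in ET.fromstring(text).iter():
        namespace, local = split_tag(element.tag)
        groups.setdefault(namespace, []).append(local)
    return groups


groups = elements_by_namespace(DOC)
```

An XML parser keeps the custom element cleanly separated from the XHTML ones, but that says nothing about what the element means, which is the objection quoted next.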
The w3c have a lot to say about this, specifically:
Don't do this! Documents need to have a meaning as well as correct
syntax. SGML and XML only define syntax. HTML and XHTML define
meaning. If you add elements that aren't defined by a standard, only
you yourself know what they mean. And in 20 or 50 years, even you may
not know it anymore…
Of course, you can experiment, for example to work on future Web
formats, but other than that you should not use proprietary elements.
Nowadays, we thankfully have HTML5 which dropped all this XML stuff (no one was using it and it adds a lot of complexity). It's not extensible in the same way, but that's probably a good thing!

what is the point of being XHTML compliant?

All modern browsers understand HTML, so what is the point of being XHTML compliant other than writing more characters found on the far right side of the keyboard?
There is no point that I can think of. The W3C has canceled XHTML 2.0, although there is supposed to be an XHTML5, which I guess is HTML5 for masochists. Originally XHTML was going to lead us into the world of "correct" HTML documents, but it generated as many (or more) problems than it ever solved.
We validate against either HTML 4.01 Transitional or HTML5 (to the degree that you can do that). That plus clean CSS gives you about the best you can shoot for.
XHTML was originally supposed to be a "next generation of HTML", as well as a stricter version of HTML (which would cause failures if any error showed up in the page). Due to a variety of loopholes and any number of other issues with XHTML (such as pages serving up the wrong mimetype), hardly any pages are actually XHTML, they're just HTML with some extra characters.
Eventually, HTML5 was proposed, w3c split into two groups, then the people working on XHTML 2.0 switched to something better (HTML5) and now everyone is talking about HTML5 taking over everything.
For a longer version (with far more detail), check out this chapter from Dive Into HTML5: http://diveintohtml5.ep.io/past.html
According to http://www.dev-archive.net/articles/xhtml.html, one of the reasons XHTML was created was:
to add the XML ability to extend the language through namespaces. This will make it possible for an author to express more structures and richer semantics than is possible with HTML today. In effect XHTML inherits the possibility of supporting more than one language — instead of extending HTML in a monolithic fashion, XHTML can be extended through modules, where each module define a specific subset of the language. This, theoretically, means extension of the language can be done without the need for a browser upgrade.
XHTML is meant to make the use of XML–based languages in end–user applications such as browsers easy, but can also be used for various data processing and storage purposes in situations where the web is only one of several channels. XHTML take advantage of the extensibility of XML to support multiple namespaces and through them languages.
That article also notes that for most people this won't be useful:
Recommendations
If you don’t have any specific need to deliver XML–based structures to the client, e.g. due to mixing namespaces such as having MathML content in your pages, using Ruby (XHTML 1.1) or techniques such as ACCESS (XHTML 1.2) then consider whether you won’t be better off simply by using HTML 4.01 Strict.
Edit with additional thoughts:
I forgot to mention the point I popped in here to bring up too - XHTML can be more easily manipulated into other languages using XSL transforms.
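The same property that makes XSL transforms possible also lets any generic XML tool work on the page. A minimal Python sketch of that idea follows; note that it uses ElementTree from the standard library rather than XSLT (the standard library has no XSLT processor), and the sample page and the extract_links name are invented for the example.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical well-formed XHTML page used only for this illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example page</title></head>
  <body>
    <h1>Heading</h1>
    <p>See <a href="http://example.com/a">first</a> and
       <a href="http://example.com/b">second</a>.</p>
  </body>
</html>"""


def extract_links(xhtml_text):
    """Return (href, link text) pairs using an ordinary XML parser."""
    root = ET.fromstring(xhtml_text)
    return [
        (a.get("href"), "".join(a.itertext()).strip())
        for a in root.iterfind(".//x:a", NS)
    ]


links = extract_links(SAMPLE)
```

This only works because the input is well-formed XML; tag-soup HTML would need an HTML parser first.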

XHTML still harmful?

I'm starting a project where the client has mandated the use of XHTML 1.0 Strict. Now I'm wondering whether the problems described in Sending XHTML as text/html Considered Harmful are still current and whether I should try to convince the client that this (very strongly stated) requirement is counterproductive.
Does Internet explorer handle application/xhtml+xml correctly by now?
IE9 handles application/xhtml+xml, including SVG inside it, one of the main reasons to want to use this media type. (Otherwise, there's relatively little point in using it to date, as you get a bunch of scripting changes, and IE<9 incompatibility, in return for relatively little if any performance gain at the moment.)
I don't agree with Hixie that serving XHTML as text/html has ever been really harmful. Using the HTML-compatibility guidelines, XHTML poses no problems to any browsers since the ancient Netscape 4. Although it doesn't really get you anything on the client-side, it can be helpful to your own page handling workflow if you're working with XML processing tools. And the XML syntax rules, being stricter-but-simpler than HTML, are a good thing to author to; this gives the validator a chance to pick up on errors that are valid constructs in SGML/HTML but which are almost certainly not what you meant. (On the other hand, since the validator won't enforce HTML-compatibility guidelines there are a couple of places where it can let through well-formed-but-troublesome markup, most commonly self-closed <script> tags breaking the whole page.)
Specifically, to answer his points: /> and related SGML issues are only a problem to tools that really believe HTML is SGML—which is no browser ever, in the past. In the future, it is specifically allowed in non-XML HTML5.
Hiding scripts/stylesheets from ‘legacy’ (pre-HTML 3.2!) browsers hasn't been an issue for a decade or so: I came up with the mangled comment hack he (rightly) derides as ridiculous, but it was only an exercise; I never intended anyone to use it except in some strange hypothetical emergency. It's certainly not ‘necessary’ for using embedded scripts and stylesheets in XHTML-as-HTML... a straight //<![CDATA[ hack is enough if you need to be able to include < and & characters, and more commonly you don't even need that.
No-one actually wants to sniff for XHTML-as-HTML and treat it differently, so that whole section is moot. “Sending XHTML 1.1 as text/html is NEVER fine” has been changed by W3C (it now is fine after all), and XHTML 2.0 is dead.
So yes, use XHTML 1.0 Strict, or XHTML 1.1 or XHTML5, if you like. But until IE9 is your baseline browser (and that's not going to be the case for ages), you'll have to stick with text/html.
Internet Explorer 9 will handle application/xhtml+xml documents through a tag soup parser.
Internet Explorer 8 and earlier will prompt the user to save the document or open it in another application.
Internet Explorer 6 and newer all have significant market share (although this does depend, to some degree, on your market).
Nothing significant has changed as regards browser support for real XHTML for many years.
It is still far more trouble than it is worth unless you actually use XML parsers in your production chain (in which case, good luck persuading them to output XHTML that meets the HTML Compatibility Guidelines).
This depends on what you mean by "Internet Explorer".
For instance, IE6 is still from something like 2001 (that hasn't changed), and no, it still doesn't handle it correctly.
Over the past year (27th May 2017 - 27th April 2018), the combined share of IE 6, 7, and 8 was 1.72% according to netmarketshare.
Every other major browser supports real XHTML (i.e. sent with the application/xhtml+xml MIME type). My answer to you is "No, it's not harmful".
Whether it's advantageous, I would guess it doesn't matter much until you actually grok and use XML technologies (SVG, MathML, etc) on the web comfortably (yes HTML syntax also supports them, but it's virtually a hack).
If browser makers put more effort into XML parsers, it could still matter for pure parsing speed.

What DOCTYPE should I target today?

I'm refactoring a .Net web application that is in
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >
Right now the approach is just to aim for the stars and go for the latest doctype just because it's latest, I would like to make a wiser choice and target a specific one and for good reasons.
There are similar questions existing but the answers might be outdated now.
What is the difference, advantages, disadvantages between standards and quirks mode, what are some quirks I may run into with differently set doctypes?
I have been told that an XHTML doctype is preferable to integrate AJAX since the UpdatePanel serializes it and to do so needs to have an XHTML doctype, to what extent is this true?
And for browser compatibility, in which direction are browsers going in terms of DOCTYPE, is there a common trend or do they differ?
HTML5 doctype, which is
<!DOCTYPE html>
XHTML is largely dead as a standard, and never was implemented correctly in most cases.
The new thing is HTML 5.
<!DOCTYPE html> is what you use to specify it. That's it. No DTD name or URL or whatever.
If you're using something that likes XML, like .net, then you might want to use XHTML. But don't do it for any other reason; XHTML never was really popular as a standard, or at least it was almost never used correctly.
Any Doctype:
HTML 4.01 or XHTML 1.0
Strict or Transitional
served as html (not html+xml) should be OK. There's no such thing as a better doctype, you just have to choose one filling your needs and then stick to its rules.
Avoid Frameset, but if you have to, use the title attribute to describe the role of each frame to a screen reader user (same with iframe btw).
Quirks mode (no Doctype) is a PITA, avoid it at all cost. This was OK 8 years ago.
No XML prologue unless you're serving html+xml (good luck with that! If you like complicated things when it's not needed, that's your choice)
If you are forced to use attributes that are forbidden in Strict mode (target="_blank" for example) then use Transitional mode: this is why it was created! And please indicate to your users that the link will open in a new page, whether in the text of your link or in its title. This is important from an accessibility point of view.
HTML 5 is the next big thing, we're waiting for it but as long as it won't work in every browser (I mean IE without JS) it's not advisable to use it in "serious" public sites. Is it even a Draft? What if entire parts of it are rewritten in a couple of months?
My web agency uses it for its website but we won't use it on a client site anytime soon: it's just too soon.
Sidenote: I often see catch phrases like "a modern website in HTML5 and CSS3" implying that CSS3 is made for HTML 5. CSS3 has nothing to do with HTML5 and can already be used, as long as it degrades gracefully on old browsers.
You can design HTML5 with CSS2.1 or HTML4.01 Transitional with the latest CSS3 animations that only work in webkit nightlies, no problem.
Whatever you choose, make sure your MIME type is compatible with your DOCTYPE.
The browser will use the MIME type (the HTTP Content-Type header) to determine how to treat your page. For example: a document with an XHTML 1.1 DOCTYPE served with Content-Type text/html is parsed as HTML.
DOCTYPE is important, but largely irrelevant if the wrong Content-Type is used.
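To make that concrete, here is a small Python sketch of a consumer that picks its parser from the Content-Type alone, the way the answer describes browsers behaving. The process function, the XML_TYPES set and the sample markup are assumptions made up for this illustration, and Python's parsers only stand in for a browser's: ElementTree plays the strict XML parser, html.parser the forgiving HTML one.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Media types treated as XML in this sketch (an assumption, not a full list).
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}

# Not well-formed XML: <br> is never closed.
BROKEN = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>unclosed<br></p></body></html>"
)


class TagCollector(HTMLParser):
    """Forgiving parser that just records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def process(content_type, body):
    """Choose the parser from the media type only; the DOCTYPE is ignored."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in XML_TYPES:
        try:
            ET.fromstring(body)
        except ET.ParseError:
            return ("xml", "fatal error")  # any well-formedness error is fatal
        return ("xml", "ok")
    collector = TagCollector()
    collector.feed(body)
    collector.close()
    return ("html", "ok")  # the HTML path tolerates the same mistake
```

The same bytes end up on two very different code paths depending on a single response header.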
Browsers have never actually used DOCTYPE to determine the markup language of your document (they use HTTP Content-type instead), so which DOCTYPE you chose was never hugely relevant - just as long as you are using a valid DOCTYPE of some description. Whichever you choose is up to you.
If you're writing HTML, <!DOCTYPE html> is the shortest to type, and puts all browsers into standards mode (which is what you want).
If you're writing XHTML, <!DOCTYPE html> is also perfectly legitimate (XHTML actually requires no DOCTYPE at all, as it relies entirely on HTTP Content-type, but there's no harm putting a DOCTYPE in for portability).
Don't use <!doctype html> - while this is technically valid HTML, it's invalid XHTML so will break if you ever try to parse your page as XML.
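The case-sensitivity point is easy to check with any XML parser. A minimal Python sketch, assuming ElementTree as a stand-in for whatever XML tooling you use (the page markup and the parses_as_xml name are invented for the example):

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal page body shared by both variants.
PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head><body><p>hi</p></body></html>"
)


def parses_as_xml(document):
    """Return True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


upper_ok = parses_as_xml("<!DOCTYPE html>\n" + PAGE)  # XML keyword in capitals
lower_ok = parses_as_xml("<!doctype html>\n" + PAGE)  # lowercase is not XML
```

Here upper_ok comes out True and lower_ok False, because XML keywords are case-sensitive while the HTML syntax is not.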
Slightly OT sidenote: Some people here have commented that XHTML is a "dead" standard - this is false. XHTML has been integrated into the upcoming HTML5 spec. The spec is entitled "HTML5: A vocabulary and associated APIs for HTML and XHTML"
See:
http://www.w3.org/TR/html5/the-xhtml-syntax.html
http://html5doctor.com/html-5-xml-xhtml-5/

What is the role of xhtml in Html5?

I am reading and hearing conflicting information on this subject. The W3C closed the XHTML 2.0 working group and is asking us to look at the XHTML5 coming out of HTML5. How is this different from XHTML 1.0 or 1.1?
XHTML5 is defined by means of abstract tree-like elements (i.e. by DOM), unlike previous HTML versions, that were defined by tags, which were tied to SGML representation.
By using abstract elements, document tree can have several representations. HTML5 defines two standard serializations: SGML-like (technically not based on SGML) HTML5 and XML-based XHTML5. You could even invent your own serialization format, for example JSON-based.
XHTML5 is semantically equivalent to HTML5 (i.e. it has the same sets of elements, attributes and nesting rules), but expressed in a different syntax. It is even possible to construct a document that conforms to both HTML5 and XHTML5.
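As a rough sketch of what such a dual-syntax document looks like, the Python below feeds one small page to both an XML parser and an HTML parser and compares the element names each one sees. The sample page is made up for the example, Python's parsers only approximate what a browser does, and matching element names is a much weaker check than real conformance to both syntaxes.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical page written to satisfy both syntaxes: uppercase DOCTYPE,
# XHTML namespace, lowercase tags, and self-closed void elements.
POLYGLOT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><meta charset="utf-8"/><title>Polyglot</title></head>
<body><p>Hello<br/>world</p></body>
</html>"""


class StartTags(HTMLParser):
    """Collect element names in document order using the HTML parser."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)


def names_via_html(text):
    parser = StartTags()
    parser.feed(text)
    parser.close()
    return parser.names


def names_via_xml(text):
    # ElementTree reports tags as '{namespace}local'; keep the local part.
    return [el.tag.split("}")[-1] for el in ET.fromstring(text).iter()]


html_names = names_via_html(POLYGLOT)
xml_names = names_via_xml(POLYGLOT)
```

Both lists contain the same element names in the same order, which is the practical meaning of one document tree with two serializations.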
HTML5 is not part of SGML but XHTML is part of XML which is part of SGML. So you can have empty tags within HTML5 but not within XHTML.
You can extend XHTML with any XML structure as long as you provide a DTD for that format. In HTML5 there are only some extensions like SVG and MathML which you can use.
I really liked XHTML because it is like XML but HTML5 has a lot more to offer beside other XML formats. Just google a bit what Google, Mozilla, YouTube etc. have to offer with HTML5 and how much you can do with pure HTML5+CSS3 and without the need of JavaScript.
HTML5 has the option to be XML-compliant or not, as far as requiring strict entities, closing tags, etc. So if a strict XHTML format is important to you (as it is to me), HTML5 allows for that. But the flip side of this is that when consuming HTML5 documents from other services, you cannot necessarily rely on them to be strict like you could with XHTML.
The best we can do is encourage others to follow the stricter format by adhering to it ourselves and evangelizing it. IMO it helps far more than it hurts.
HTML5 is HTML 4 with some tags added, some tags taken away, a different doctype, and generally, a bunch of new stuff.
XHTML 1 is just HTML 4 with XML-style syntax.
XHTML5 is just HTML5 with XML-style syntax.
(I may be glossing over some XML details there.)
