What is "Extensible" about XHTML? - xhtml

Why is XHTML called "eXtensible" (the X in XHTML)? Can we, as individual web developers, actually extend it?
What separates it from ordinary HTML?

Well, firstly, things have moved on somewhat, and XHTML isn't really a thing anymore. HTML5 isn't parsed as XML, and XHTML 2.0 was of course cancelled.
Despite that, it's possible to use XHTML if you serve it with the application/xhtml+xml MIME type; just be aware of the shortcomings (any well-formedness error produces the "yellow screen of death", and older versions of IE don't render the page at all).
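To make the "any error" point concrete, here is a minimal sketch using only Python's standard library. It is not what a browser does internally; it only assumes that a strict XML parser (xml.etree.ElementTree) behaves like the application/xhtml+xml path and that a tolerant HTML parser (html.parser) behaves like the text/html path. The sample markup and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical page fragment with a <p> that is never closed.
BROKEN = "<html><body><p>Unclosed paragraph</body></html>"
FIXED = "<html><body><p>Closed paragraph</p></body></html>"


def is_well_formed_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Tolerant HTML parser that just records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(markup):
    """Feed markup to the tolerant parser and return the start tags found."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


print(is_well_formed_xml(BROKEN))  # the XML parser refuses the document
print(is_well_formed_xml(FIXED))   # the corrected document is accepted
print(html_start_tags(BROKEN))     # the HTML parser carries on regardless
```

The XML path is all-or-nothing, which is roughly why a single stray tag takes down a whole page served as application/xhtml+xml.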
For a new project, use the HTML5 doctype and serve as text/html. XHTML can be considered a failure for many reasons.
Anyway, with XHTML you can do things like this:
<!DOCTYPE html SYSTEM "http://example.com/my-xhtml-custom.dtd">
<html xmlns='http://www.w3.org/1999/xhtml' xmlns:custom="http://example.com/" xml:lang='en-US'>
Then copy http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd, edit it however you like, and host it at the URL referenced in the DOCTYPE.
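As a rough illustration of what the extra namespace buys you, the sketch below parses an XHTML document containing one element from the custom namespace. The element name custom:widget is hypothetical (the snippet above does not define any elements), and the DOCTYPE is left out because ElementTree does not fetch external DTDs anyway.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
CUSTOM_NS = "http://example.com/"  # namespace URI taken from the snippet above
XML_NS = "http://www.w3.org/XML/1998/namespace"

# custom:widget is an invented element name, used only for this example.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:custom="http://example.com/" xml:lang="en-US">
  <body>
    <p>Standard paragraph</p>
    <custom:widget>Author-defined element</custom:widget>
  </body>
</html>"""

root = ET.fromstring(DOC)

# ElementTree expands prefixes into "{namespace-uri}local-name" form.
custom_elements = [
    el for el in root.iter() if el.tag.startswith("{" + CUSTOM_NS + "}")
]

print(root.tag)                         # {http://www.w3.org/1999/xhtml}html
print(root.get("{" + XML_NS + "}lang"))  # en-US
print(custom_elements[0].tag)           # {http://example.com/}widget
print(custom_elements[0].text)          # Author-defined element
```

An XML parser keeps the custom element cleanly separated from the XHTML ones, but nothing here tells a browser what the element means, which is the concern raised in the W3C quote that follows.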
The W3C has a lot to say about this, specifically:
Don't do this! Documents need to have a meaning as well as correct
syntax. SGML and XML only define syntax. HTML and XHTML define
meaning. If you add elements that aren't defined by a standard, only
you yourself know what they mean. And in 20 or 50 years, even you may
not know it anymore…
Of course, you can experiment, for example to work on future Web
formats, but other than that you should not use proprietary elements.
Nowadays, we thankfully have HTML5 which dropped all this XML stuff (no one was using it and it adds a lot of complexity). It's not extensible in the same way, but that's probably a good thing!

Related

what is the point of being XHTML compliant?

All modern browsers understand HTML, so what is the point of being XHTML compliant, other than writing more characters found on the far right side of the keyboard?
There is no point that I can think of. The W3C has canceled XHTML 2.0, although there is supposed to be an XHTML5, which I guess is HTML5 for masochists. Originally XHTML was going to lead us into the world of "correct" HTML documents, but it generated as many (or more) problems than it ever solved.
We validate against either HTML 4.01 Transitional or HTML5 (to the degree that you can do that). That plus clean CSS gives you about the best you can shoot for.
XHTML was originally supposed to be a "next generation of HTML", as well as a stricter version of HTML (which would cause failures if any error showed up in the page). Due to a variety of loopholes and any number of other issues with XHTML (such as pages being served with the wrong MIME type), hardly any pages are actually XHTML; they're just HTML with some extra characters.
Eventually, HTML5 was proposed, the W3C split into two groups, then the people working on XHTML 2.0 switched to something better (HTML5), and now everyone is talking about HTML5 taking over everything.
For a longer version (with far more detail), check out this chapter from Dive Into HTML5: http://diveintohtml5.ep.io/past.html
According to http://www.dev-archive.net/articles/xhtml.html, one of the reasons XHTML was created was:
to add the XML ability to extend the language through namespaces. This will make it possible for an author to express more structures and richer semantics than is possible with HTML today. In effect XHTML inherits the possibility of supporting more than one language: instead of extending HTML in a monolithic fashion, XHTML can be extended through modules, where each module define a specific subset of the language. This, theoretically, means extension of the language can be done without the need for a browser upgrade.
XHTML is meant to make the use of XML–based languages in end–user applications such as browsers easy, but can also be used for various data processing and storage purposes in situations where the web is only one of several channels. XHTML take advantage of the extensibility of XML to support multiple namespaces and through them languages.
That article also notes that for most people this won't be useful:
Recommendations
If you don’t have any specific need to deliver XML–based structures to the client, e.g. due to mixing namespaces such as having MathML content in your pages, using Ruby (XHTML 1.1) or techniques such as ACCESS (XHTML 1.2) then consider whether you won’t be better off simply by using HTML 4.01 Strict.
Edit with additional thoughts:
I forgot to mention the point I popped in here to bring up too - XHTML can be more easily manipulated into other languages using XSL transforms.

Which standard (HTML/XHTML ) to learn to be ready to use HTML5 when it happens?

I am really new to this so please forgive the basicness of my question...
I want to learn to design websites and I have a program which I am planning to learn (Dreamweaver CS5) using tutorials from Lynda.com. However on the tutorial it says you should have a good grasp of HTML and CSS before starting Dreamweaver.
I looked at the Lynda.com video for HTML but it is all focused on XHTML. http://www.lynda.com/tutorial/47603
Now I am a bit confused. I heard a new standard was coming in (HTML5). If I learn XHTML - does that mean that I will then have to go back at a later date and learn HTML4 so that I can then catch up and learn HTML5 or will I be able to use my XHTML knowledge and add the future HTML5 code to it?
For example there is a Lynda video on HTML5 but the author says you need a knowledge of html before you can watch it.
Do you think the Lynda.com video on XHTML/HTML is a good place to start or do I need to get a book on HTML4 instead?
If you were starting out now would you learn HTML4 or XHTML?
Thanks
XHTML, absolutely.
The last recommended HTML version was 4.x, and it dates from the 1990s.
Learn XHTML as much as possible, and try to use strict versions.
I agree with @Matías, if only because of its strictness, which will likely result in cleaner code in the long run. That said, porting from one HTML version to another shouldn't be too difficult regardless of which one you choose.
I find that when programming the use of XHTML is nice because it allows me to catch errors in my markup at compile time instead of some obscure bug showing itself way later when I modify a page.
The lack of XHTML 1.1 support in IE has been a pain, but there are workarounds such as XSL transformations and the like. IE9 has finally added support.
Once (X)HTML5 support becomes strong in the major browsers, I intend to use XHTML5 in any web projects I do for work. Supporting legacy IE versions will still be a pain, but it will be manageable.
I would learn HTML4.01, but only because I detest XHTML.
It doesn't matter that much, making the port from (X)HTML x.xx to (X)HTML y.yy is not that hard. You'll have a few pitfalls, but that's all.
On the other hand, HTML5 is quite different and you can start learning it already. It's already happening.
Whatever you learn, make sure you learn the Strict version.
Check this out for future proofing: http://blog.twostepmedia.co.uk/css3-still-novelty-or-usable-in-everyday-web-development/
To the O/P, learn the basics of HTML4 and then get straight onto HTML5, you'll be way ahead of the pack and your websites WILL stand out :)
I would personally work on learning HTML5. By the time you get proficient at it to be good enough to professionally code websites, most of the major browser vendors will have adopted it as the standard.
Remember, web technology moves fast! What's hot today will be obsolete tomorrow, and what's in beta now will be hot tomorrow.
I found this http://headjs.com, a modernizer, here on Stack Overflow, which is used to future-proof web applications. This makes learning and using HTML5 markup a possibility today, so that as browser vendors update their applications, they'll slide right into the HTML5 functionality.
For a brief summary:
HTML 4.01 is the current standard of markup languages for the internet.
XHTML 1.0 was forked off from HTML 4.01. It introduced greater strictness in validation, more XML-like syntax (e.g. <br /> instead of <br>) and XML namespaces for things like MathML (for embedding mathematical equations in pages; very infrequently used). In theory XHTML allowed people to define their own tags, but in practice this never happened. In actuality, the only real differences from HTML 4.01 are the self-closing tags, a different doctype (the header at the top of HTML documents), and a few attributes on the <html> tag.
XHTML 1.1 was a natural progression from XHTML 1.0. It introduced even greater strictness, and enforced things like mime-types for served documents. However, because it declared it was XML instead of HTML, and had to be served to the browser as XML (which Internet Explorer to this day does not support), it never took off.
XHTML 2.0 was a draft recommendation that got scrapped along the way. Consequently, no one uses it.
HTML 5 is the next evolution from HTML 4.01. It adds a lot of new tags, new functionality such as local storage (meaning more web-app type applications are possible), and some other goodies. It comes in two flavours - HTML 5, which uses HTML-style syntax, and XHTML 5, which uses XHTML syntax with self-closing tags (and is not to be confused with XHTML 2, which is dead remember.) It is 'the next big thing' in web markup languages, but is still in draft stage. Some browsers are introducing support for new HTML 5 tags, but legacy browsers have no support.
HTML 5 cannot be safely used in current sites, due to the draft nature of the specification. Some sites are doing so, but those sites can possibly get the whole nature of the language yanked out from under their feet.
HTML 5 is not expected to be a formal recommendation until 2022.
In summary: The current language of the web is HTML 4.01. HTML 5 expands on that greatly, but is not ready for everyday use. And the differences between HTML 4.01 and any flavour of XHTML are minimal at best.
XHTML's main benefit, as Matias said, is its XML compatibility, and also the other way round; I regularly use an XSLT to transform an XML document into XHTML. Although XSLT can output HTML, it's HTML that's compliant with XML anyway.
Strictly speaking, there's no reason you can't write HTML5 that's totally XML compliant; for that reason alone, I'd say go with HTML5, and by writing it so that it IS XML compliant, you also get all the benefits of XHTML.

What DOCTYPE should I target today?

I'm refactoring a .Net web application that is in
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >
Right now the approach is just to aim for the stars and go for the latest doctype simply because it's the latest. I would like to make a wiser choice and target a specific one, for good reasons.
There are similar questions existing but the answers might be outdated now.
What is the difference, advantages, disadvantages between standards and quirks mode, what are some quirks I may run into with differently set doctypes?
I have been told that an XHTML doctype is preferable for integrating AJAX, since the UpdatePanel serializes the page and needs an XHTML doctype to do so. To what extent is this true?
And for browser compatibility, in which direction are browsers going in terms of DOCTYPE? Is there a common trend, or do they differ?
HTML5 doctype, which is
<!DOCTYPE html>
XHTML is largely dead as a standard, and never was implemented correctly in most cases.
The new thing is HTML 5.
<!DOCTYPE html> is what you use to specify it. That's it. No DTD name or URL or whatever.
If you're using something that likes XML, like .NET, then you might want to use XHTML. But don't do it for any other reason; XHTML never was really popular as a standard, or at least it was almost never used correctly.
Any Doctype:
HTML 4.01 or XHTML 1.0
Strict or Transitional
served as text/html (not application/xhtml+xml) should be OK. There's no such thing as a better doctype; you just have to choose one that fits your needs and then stick to its rules.
Avoid Frameset, but if you have to use it, use the title attribute to describe the role of each frame to a screen reader user (same with iframe, by the way).
Quirks mode (no Doctype) is a PITA, avoid it at all cost. This was OK 8 years ago.
No XML prologue unless you're serving application/xhtml+xml (good luck with that! If you like complicated things when they're not needed, that's your choice).
If you are forced to use attributes that are forbidden in Strict mode (target="_blank" for example), then use Transitional mode: this is why it was created! And please indicate to your users that the link will open in a new page, whether in the text of your link or in its title. This is important from an accessibility point of view.
HTML 5 is the next big thing, and we're waiting for it, but as long as it doesn't work in every browser (I mean IE without JS) it's not advisable to use it in "serious" public sites. Is it even a Draft? What if entire parts of it are rewritten in a couple of months?
My web agency uses it for its website but we won't use it on a client site anytime soon: it's just too soon.
Sidenote: I often see catch phrases like "a modern website in HTML5 and CSS3" implying that CSS3 is made for HTML 5. CSS3 has nothing to do with HTML5 and can already be used, as long as it degrades gracefully on old browsers.
You can design HTML5 with CSS2.1 or HTML4.01 Transitional with the latest CSS3 animations that only work in webkit nightlies, no problem.
Whatever you choose, make sure your MIME type is compatible with your DOCTYPE.
The browser will use the MIME type (the HTTP Content-Type header) to determine how to treat your page. For example: a DOCTYPE of XHTML 1.1 Strict served with a Content-Type of text/html is parsed as HTML.
DOCTYPE is important, but largely irrelevant if the wrong ContentType is used.
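A rough sketch of that rule, assuming a much simplified model of a browser: the parser is chosen from the Content-Type header alone, and the DOCTYPE inside the document plays no part in the choice. The function names and the sample page are invented for this example, and Python's stdlib parsers stand in for the browser's real ones.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Media types assumed here to trigger XML parsing.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}

# Hypothetical page: XHTML DOCTYPE, but with an unclosed <br>.
PAGE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>one<br>two</p></body></html>"
)


def parsing_mode(content_type):
    """Pick 'xml' or 'html' from the header value, ignoring parameters."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return "xml" if media_type in XML_TYPES else "html"


def accepts(markup, content_type):
    """Return True if the parser selected by the header accepts the markup."""
    if parsing_mode(content_type) == "xml":
        try:
            ET.fromstring(markup)
        except ET.ParseError:
            return False
        return True
    parser = HTMLParser()  # tolerant: recovers from the unclosed <br>
    parser.feed(markup)
    parser.close()
    return True


print(parsing_mode("text/html; charset=utf-8"))  # html
print(parsing_mode("application/xhtml+xml"))     # xml
print(accepts(PAGE, "text/html; charset=utf-8")) # True
print(accepts(PAGE, "application/xhtml+xml"))    # False
```

The same bytes with the same XHTML DOCTYPE are fine as text/html and fatal as application/xhtml+xml, so the header decides and the DOCTYPE does not.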
Browsers have never actually used DOCTYPE to determine the markup language of your document (they use HTTP Content-type instead), so which DOCTYPE you chose was never hugely relevant - just as long as you are using a valid DOCTYPE of some description. Whichever you choose is up to you.
If you're writing HTML, <!DOCTYPE html> is the shortest to type, and puts all browsers into standards mode (which is what you want).
If you're writing XHTML, <!DOCTYPE html> is also perfectly legitimate (XHTML actually requires no DOCTYPE at all, as it relies entirely on HTTP Content-type, but there's no harm putting a DOCTYPE in for portability).
Don't use <!doctype html> - while this is technically valid HTML, it's invalid XHTML so will break if you ever try to parse your page as XML.
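You can check the case-sensitivity claim yourself with any XML parser. Here is a small sketch with Python's ElementTree standing in for a browser's XML parser; the sample document and the helper name are made up for the example.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Identical documents apart from the case of the DOCTYPE keyword.
UPPER = (
    "<!DOCTYPE html>"
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head><body/></html>"
)
LOWER = UPPER.replace("<!DOCTYPE", "<!doctype")

print(parses_as_xml(UPPER))  # True: XML keywords are case-sensitive and this one is correct
print(parses_as_xml(LOWER))  # False: the lowercase keyword is not valid XML
```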
Slightly OT sidenote: Some people here have commented that XHTML is a "dead" standard - this is false. XHTML has been integrated into the upcoming HTML5 spec. The spec is entitled "HTML5: A vocabulary and associated APIs for HTML and XHTML"
See:
http://www.w3.org/TR/html5/the-xhtml-syntax.html
http://html5doctor.com/html-5-xml-xhtml-5/

What is the role of xhtml in Html5?

I am reading and hearing conflicting information on this subject. The W3C closed the XHTML 2.0 working group and is asking us to look at the XHTML5 coming out of HTML5. How is this different from XHTML 1.0 or 1.1?
XHTML5 is defined by means of abstract tree-like elements (i.e. by the DOM), unlike previous HTML versions, which were defined by tags tied to an SGML representation.
By using abstract elements, the document tree can have several representations. HTML5 defines two standard serializations: the SGML-like (though technically not SGML-based) HTML5 and the XML-based XHTML5. You could even invent your own serialization format, for example a JSON-based one.
XHTML5 is semantically equivalent to HTML5 (i.e. it has the same sets of elements, attributes and nesting rules), but expressed in a different syntax. It is even possible to construct a document that conforms to both HTML5 and XHTML5.
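The "one tree, two serializations" idea can be sketched with Python's ElementTree, which can write the same in-memory tree in either XML or HTML style. This is only an approximation: ElementTree's "html" output method is not a conforming HTML5 serializer, and the tiny tree below is made up for the example.

```python
import xml.etree.ElementTree as ET

# Build one abstract tree: <p> containing text, a line break, and more text.
p = ET.Element("p")
p.text = "line one"
br = ET.SubElement(p, "br")
br.tail = "line two"

# Serialize the same tree in two syntaxes.
as_html = ET.tostring(p, encoding="unicode", method="html")
as_xml = ET.tostring(p, encoding="unicode", method="xml")

print(as_html)  # <p>line one<br>line two</p>
print(as_xml)   # <p>line one<br />line two</p>
```

The tree is identical in both cases; only the way the empty br element is written differs, which is the sense in which XHTML5 and HTML5 are two syntaxes for the same vocabulary.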
HTML5 is not part of SGML, but XHTML is part of XML, which is a subset of SGML. So you can have unclosed empty tags (such as <br>) in HTML5 but not in XHTML.
You can extend XHTML with any XML structure as long as you provide a DTD for that format. In HTML5 there are only a few extensions, like SVG and MathML, that you can use.
I really liked XHTML because it is like XML, but HTML5 has a lot more to offer besides other XML formats. Just google a bit to see what Google, Mozilla, YouTube etc. have to offer with HTML5 and how much you can do with pure HTML5+CSS3, without the need for JavaScript.
HTML5 has the option to be XML-compliant or not, as far as requiring strict entities, closing tags, etc. So if a strict XHTML format is important to you (as it is to me), HTML5 allows for that. But the flip side of this is that when consuming HTML5 documents from other services, you cannot necessarily rely on them to be strict like you could with XHTML.
The best we can do is encourage others to follow the stricter format by adhering to it ourselves and evangelizing it. IMO it helps far more than it hurts.
HTML5 is HTML 4 with some tags added, some tags taken away, a different doctype, and generally, a bunch of new stuff.
XHTML 1 is just HTML 4 with XML-style syntax.
XHTML5 is just HTML5 with XML-style syntax.
(I may be glossing over some XML details there.)

DTD with RDFa and XHTML 1.0 Transitional support

Is there a W3C document type available with both XHTML 1.0 transitional support and RDFa support?
I am aware of the XHTML+RDFa 1.0 (http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd) DTD but that is XHTML 1.1 which is not compatible with my current website. It seems that there is also a HTML4+RDFa 1.0 (http://www.w3.org/MarkUp/DTD/html4-rdfa-1.dtd) DTD available.
My main reason for not serving XHTML 1.1 is Internet Explorer, of course, although I could probably serve it as text/html for the IE users if there is no other way.
I don't believe there is such a DTD, no.
HTML is becoming more and more fragmented and creating DTDs for every possible combination is not going to be practical. HTML5 is not SGML based and effectively gives up on DTDs. In future, validators are probably going to have to change so that they call in component collections like RDFa and ARIA, and indicate in their results which collections are required to allow each particular piece of mark-up to be conforming.
Unfortunately, we're really at about the transition point currently, so there's no clean solution to your problem. However, a certain amount of pragmatism will get you a long way.
Now, XHTML+RDFa 1.0 may be defined by DTD as an extension of XHTML 1.1, but it isn't actually XHTML 1.1, nor is it XHTML 1.0 transitional or strict, or indeed anything other than "XHTML+RDFa 1.0".
So you can take a pragmatic serving approach. Consider the HTML5 attitude to this. It says that anything you serve as text/html is an HTML serialization of the object model, regardless of any DOCTYPE that you declare. This is in practice what browsers do anyway.
Similarly, anything you serve with an XML content type such as application/xhtml+xml is an XML serialization. Those parts of the XML that have the xhtml namespace constitute XHTML.
So, in practice, you can serve your XHTML+RDFa 1.0 as text/html or application/xhtml+xml without any difficulty, provided that the mark-up meets the requirements for polyglot documents.
That leaves the validation. Leaving aside RDFa, is there any mark-up that you're using that's conformant XHTML 1.0 Transitional but not conformant XHTML 1.1? If so, do you care enough about perfect validation to either change these, or to back away from using RDFa? Presumably you're using RDFa for your users' benefit, while validation is essentially a convenience tool for yourself.
I faced a similar situation recently, when I decided to add ARIA attributes to my XHTML 1.0 pages. I decided that Accessibility trumps Validity, and I would add the attributes and forget about ensuring my pages were 100% valid.
In reality unless you are concerned with the DTD implementation there are almost no differences between the various XHTML versions and there's almost never a valid reason not to use XHTML 1.1. In what way is your website dependent upon 1.0 transitional? If you can get your site valid with 1.0 strict then moving from that to 1.1 shouldn't cause problems because essentially the only difference is the modularized DTD, which really has no drawbacks to it.
If you're still struggling with the philosophical problem of XHTML MIME I wouldn't worry because conceptually serving the wrong media type with the better organized 1.1 DTD is no more of a crime than with 1.0. The reason RDFa is implemented as it is is because adding the RDFa module to the XHTML 1.1 DTD only involves adding a few lines to the main module. Doing that to the 1.0 DTD would be harder and not as clean.
Some other things to consider are that the XHTML 1.1 second edition spec includes an XML schema implementation. Also, the latest XHTML+RDFa 1.1 working draft finally drops the (stupid) requirement for specifying a doctype altogether, so you could use schema-only validation. This would work out really well if you can figure out a way to use XML/XHTML mimetypes because no doctype declaration is required in order to get standards mode rendering on browsers which support it (all of them but IE8 and below).

Resources