IE's XHTML Compatibility - xhtml

I'm having a rather heated debate about IE's XHTML compatibility. The only thing is, I'm unsure whether the guy is trolling.
Essentially he claims that IE has absolutely no XHTML compatibility, and that a document with a defined XHTML doctype means absolutely nothing when served as content type text/html, regardless of the browser used.
I do not believe this, and sources say otherwise, but am I wrong?
Edit: Disregarding IE, does it still mean that when an XHTML doctype is declared in a document served as HTML, it is NOT XHTML, as the guy suggested? My current understanding is that XHTML is often served under the HTML content type. This means that technically you could say that XHTML is merely HTML unless the correct content type is used. But it's still XHTML syntax, so it is a little confusing.
You can find the thread in question over at digitalpoint forums.

IE indeed has no support for the application/xhtml+xml content type, whereas other browsers handle a document served that way as XML. When text/html is used, every browser will just handle XHTML as HTML; IE does nothing different here (except for the usual quirks).
More details here: http://hsivonen.iki.fi/doctype/

The guy is right. When XHTML is served as text/html, it is no longer XHTML, but funny-looking HTML. The MIME type is key.

I guess this question is done, but:
IE has absolutely no XHTML compatibility
IE won’t parse XHTML content served as text/html as XHTML. (It’ll parse it as HTML.)
Unfortunately, it won’t display XHTML content served as application/xhtml+xml as a web page — it’ll display it just like it displays any other XML content, i.e. prettified source.
As per the XHTML 1.0 spec, you are allowed to serve it as text/html for compatibility with older browsers (i.e. IE). So IE is sort of compatible with the XHTML 1.0 spec. But as many have argued, if you’re not parsing XHTML as XML, what’s the point?
a document with a defined XHTML doctype means absolutely nothing when served as content type text/html, regardless of the browser used
Depends what he means by “means”. It’s still HTML, so it’s got all the meaning associated with that. And as you say, the content is still XHTML, as it’s written in the XHTML syntax. But it won’t be parsed as XML due to the mimetype, so in that sense it’s not XML, and thus isn’t XHTML.
(If you’re wondering what the practical implications of this are, join the club.)
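To make the "MIME type decides, doctype doesn't" point concrete, here is a minimal Python sketch of the decision a client makes. The function name parser_kind and the XML_TYPES set are my own illustration, not anything taken from a browser's source, and real browsers have more cases than this (sniffing, downloads, plugins).

```python
# Hypothetical helper: which parser family would handle a response,
# judged only by its Content-Type header. The doctype inside the
# document is deliberately never looked at.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parser_kind(content_type: str) -> str:
    """Return 'html', 'xml', or 'other' for a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "html"
    if media_type in XML_TYPES or media_type.endswith("+xml"):
        return "xml"
    return "other"
```

So a page with an XHTML 1.0 Strict doctype sent as text/html lands in the "html" branch, and the same bytes sent as application/xhtml+xml land in the "xml" branch.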

BalusC is correct. More to the point is that the person you are arguing with is assuming that XHTML must be XML, which is false. XHTML 1.0 is a syntax of HTML that is compatible with XML syntax, but is however an SGML serialization that may optionally be processed as XML per paragraph 5.1 of the specification. XHTML 1.1, however, must be processed as XML.
The idea is that XHTML 1.0 is written as a transitional point between SGML and the XML syntax, even XHTML 1.0 strict. The various doctypes of XHTML 1.0 are all transitional and merely indicate the degree of conformance to the XML syntax without regard for the method of processing.

The other guy is right. You are wrong. IE has no support for application/xhtml+xml as others have said. IE treats it as "tag soup" when served as text/html as do other browsers.

Related

Writing Strict XHTML 1.0

I have an exam, where I will be provided a series of code snippets and asked to determine whether they are Valid or Invalid Strict XHTML1.0. I can't find any rules, or digestible resources online. Can anyone advise if there is a set of checks that I can memorise?
The most immediate thing you can and should do is ensure the file is served as application/xhtml+xml. If you are creating a file and don't have access to server-side scripting then you simply need to create a file with a .xhtml extension and confirm application/xhtml+xml via the developer tools in whichever browser you're using.
I highly recommend using Firefox; when it encounters an XML parsing error the whole page is hidden and replaced by a yellow background showing the error and its line and column numbers in red text. It's extremely useful for quickly fixing malformed XML.
Keep in mind that XHTML 1 (HTML4 equivalent) is outdated and I highly recommend using XHTML5. While I've updated my platform from XHTML 1 Strict to XHTML 5 (link in my profile) you will be exceptionally hard pressed to find better examples of stricter code that will adhere to XHTML5.
Also keep in mind that HTML (text/html) is handled by a browser's HTML parser whereas XHTML (application/xhtml+xml) is handled by a browser's XML parser.
An XML parser will catch malformed XML, though it will not prevent duplicate id attributes from wreaking havoc in JavaScript (the first of two or more elements with identical id values will always be targeted).
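If you want to see that difference outside a browser, a rough sketch with Python's standard xml.etree.ElementTree works. The sample strings and the is_well_formed helper are made up for illustration; this checks well-formedness only, not validity against the XHTML 1.0 Strict DTD.

```python
import xml.etree.ElementTree as ET

# Well-formed XML, but with the same id used twice.
WELL_FORMED_DUPLICATE_IDS = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p id="x">first</p><p id="x">second</p>'
    '</body></html>'
)

# The <p> element is never closed, so this is not well-formed.
MALFORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p>unclosed</body></html>'
)


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True
```

The XML parser rejects the unclosed element but accepts the duplicate ids without complaint, because id uniqueness is a validity rule rather than a well-formedness rule.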
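It should also be noted that XHTML 1 defined boolean attributes to have the same value as the attribute name:
XHTML 1
<select>
<option selected="selected"></option>
</select>
XHTML 5
<select>
<option selected="selected"></option>
</select>
XHTML5 keeps this form: a boolean attribute's value must be either empty or the attribute's own name, and the strings true and false are not allowed. Some attributes that look boolean (e.g. the autocomplete attribute) are enumerated instead and take keywords such as on or off.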
Lastly, you can have everything completely XHTML1/5 compliant, though if the media type/MIME is text/html then your page is not XHTML in any form. One of the greatest advantages of XHTML is that it has to be served strictly; strict code can be dependably served loosely, though loose code cannot be dependably served strictly, and I am not talking about doctypes.

Is a link without the protocol valid XHTML? i.e. <a href="//www.example.com/">

Is it valid XHTML / good practice to have links of the following form?
//www.example.com/foo/bar.html
If the current page is HTTP, then the link points to: http://www.example.com/foo/bar.html
If the current page is secured under HTTPS, then the link points to: https://www.example.com/foo/bar.html
In other words, is
<a href="//www.example.com/">
valid in XHTML 1.1 Strict? And, is it supported by many/all browsers?
Is it valid XHTML
Completely. XHTML doesn't care about the syntax of URIs. The href attribute is defined as containing CDATA.
valid in XHTML 1.1 Strict?
There is no such language.
XHTML 1.0 has Strict / Transitional / Frameset versions.
XHTML 1.1 is just XHTML 1.1. (And isn't blessed by rfc2854 for serving as text/html (which you need for IE < 9 support)).
And, is it supported by many/all browsers?
Yes. Support is fine.
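The resolution rule described in the question (the link inherits the scheme of the current page) can be checked with Python's standard urllib.parse.urljoin. The resolve helper and the example.org page URLs are only placeholders for illustration.

```python
from urllib.parse import urljoin

# The scheme-relative link from the question.
LINK = "//www.example.com/foo/bar.html"


def resolve(page_url: str, href: str = LINK) -> str:
    """Resolve href against the URL of the page that contains it."""
    return urljoin(page_url, href)


# The scheme comes from the containing page; host and path come from the link.
print(resolve("http://example.org/index.html"))
print(resolve("https://example.org/index.html"))
```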

Is it possible to pass W3C XHTML Strict validation and still use Google Analytics code on my webpage?

Is it possible to pass W3C XHTML Strict validation and still use Google Analytics code on my webpage?
Yes. If the specific code were to contain an &, >, or < (it doesn't), you would have to wrap the JavaScript in <script type="text/javascript">//<![CDATA[ and //]]></script> (ampersands are normally reserved for XML entities, and the others are for tags).
If you serve your web page using the XHTML MIME type application/xhtml+xml rather than the default HTML MIME type text/html, problems may result. Serving pages as application/xhtml+xml reduces cross-browser compatibility and prevents many scripts from working. In the long term, though, I would focus on HTML5 compliance rather than XHTML compliance; that's the way further development of web standards is heading.
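To see why the CDATA wrapper mentioned above matters once the markup goes through an XML parser, here is a small sketch using Python's xml.etree.ElementTree. The script body is an invented example (it is not the Google Analytics snippet) and parses_as_xml is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Inline script containing raw < and & characters.
RAW = '<script type="text/javascript">if (a < b && c) { go(); }</script>'

# Same script wrapped in a CDATA section, with the markers hidden
# from the JavaScript engine behind // comments.
WRAPPED = (
    '<script type="text/javascript">//<![CDATA[\n'
    'if (a < b && c) { go(); }\n'
    '//]]></script>'
)


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True
```

The raw version is rejected as not well-formed, while the wrapped one parses and the script text comes through with its < and && intact.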

HTML5 syntax - HTML vs XHTML

Even with HTML5 being the path forward for HTML we get two options as developers: XHTML syntax and HTML syntax. I've been using XHTML as my main doctype for 5 or so years so I'm very comfortable with it.
But my question is: given that non-XML syntax will be allowed, is there any reason to stick with a valid XML syntax? Do you gain anything going with one over another, besides preference (compatibility, etc.)? Personally I'll feel a little dirty going back to not closing tags, as it is second nature to me now, but would I gain something going back to HTML syntax?
Update: I guess my true question is: is there a reason to switch from XHTML to HTML syntax? I've been using XHTML for years and am not sure if there is a reason to switch back. Browser compatibility (IE was sometimes finicky with the application/xhtml+xml MIME type), etc.?
The advantage of XHTML syntax is that it is XML. It can be easily parsed, understood and manipulated. The HTML syntax is a lot harder for clients to work with.
Nonsense! The HTML5 spec defines how to parse HTML in a way that is relatively easy to implement, and off-the-shelf parsers are being developed that can be easily integrated into tool chains. It's even possible for an HTML5 parser to be integrated into an XML tool chain in place of an XML parser.
But what you need to understand is that in practice, you're most likely using HTML anyway, even if you think you're using XHTML based on the DOCTYPE. If your content is being served as text/html, instead of application/xhtml+xml or another XML MIME type, then your content will be processed as HTML.
With HTML5, you can choose to use HTML-only syntax, meaning that it is only compatible with being served and processed as text/html and is not well-formed XML. Or you can use XHTML-only syntax, meaning that it is well-formed XML but uses XML features that are not compatible with HTML. Or you can write a polyglot document, which is conforming and compatible with both HTML and XHTML processing (in principle, this is similar to writing XHTML 1.0 that conforms with the Appendix C guidelines).
I guess my true question is: is there a reason to switch from XHTML to HTML syntax? I've been using XHTML for years and am not sure if there is a reason to switch back. Browser compatibility (IE was sometimes finicky with the application/xhtml+xml MIME type), etc.?
As mentioned in a previous answer, text/html gets parsed as HTML and application/xhtml+xml gets parsed as XML. Thus, you should use the syntax that matches the MIME type you use.
If you are now serving text/html but using XHTML syntax, then you should fix your content to use the HTML5 syntax. You may already be close, since HTML5 allows the XMLesque /> empty element syntax for void elements (elements that are always empty, such as img and br).
If you are now using application/xhtml+xml, IE support would be a reason to switch to text/html and the HTML syntax if you care about supporting IE.
Trying to write polyglot documents that are correct HTML5 and XHTML5 (for serving different MIME types to different browsers with the same payload bytes) is harder than it seems at first sight and not worth the trouble.
The HTML5 draft is very clear about which syntax to use:
use HTML syntax when sending pages as text/html
use XHTML syntax when sending pages as application/xhtml+xml
Reference: http://dev.w3.org/html5/spec/Overview.html#authors-using-xhtml
When using XHTML you can mix it with other XML content, e.g. MathML, SVG or your own proprietary format, by just changing the namespace at some point. Also, you can embed XHTML inside other XML documents.
(well, actually MathML and SVG can be used in non-XML HTML5 too, but they are special-cased)
You shouldn't use XHTML to serve content on the Web (or any network including Internet Explorer clients); see Sending XHTML as text/html Considered Harmful for the full rationale.
Most of the benefits of XHTML have failed to materialise. While I wouldn't recommend it for new projects, XHTML served as text/html seems to be quite manageable and widespread, as long as you follow the compatibility guidelines. It probably isn't worthwhile changing any significant projects back to the HTML serialisation.
I like XHTML, because it forces me to write a good page. There are many advantages to XHTML, because browsers parse it faster, and you need to make well formed XML rather than just HTML. Also, you need to serve a page with the MIME Type application/xhtml+xml or you don't get any of the advantages of the X. The only problem with XHTML is that it won't display in IE8 and earlier.
The advantage of XHTML syntax is that it is XML. It can be easily parsed, understood and manipulated. The HTML syntax is a lot harder for clients to work with.
But ultimately, it is just a matter of syntax. Both forms are allowed for HTML5.
Update: I guess my true question is: is there a reason to switch from XHTML to HTML syntax? I've been using XHTML for years and am not sure if there is a reason to switch back. Browser compatibility (IE was sometimes finicky with the application/xhtml+xml MIME type), etc.?
You have to really consider two things. The language you are writing and the language you are sending. The Web is defined by 3 components:
URI
A resource - Markup Language (document)
A protocol - HTTP (tool for managing information space)
You can write a document with an XML syntax on your desktop such as using XHTML. In this specific environment, if you give the extension ".xhtml" to the filename and open it with your local browser, it will be parsed as XML. If you give the extension ".html" to the filename, it will be parsed as HTML. Basically in your authoring tool, it is XML, but this doesn't matter anymore once you process it with a tool.
On the Web, your resource identified by a URI will be sent with a specific MIME type; most of the time, these days, people are using text/html. The MIME type defines how the client (browser, search engine bot, etc.) must process your document. If you are using an XML syntax but send it with text/html, the document will be processed by an HTML parser.
To send your documents over the wire as XML, you have to configure your server to send them as application/xhtml+xml. (Note that IE8 and earlier versions do not understand application/xhtml+xml and will offer to save the file instead.)
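How you configure this depends entirely on your server. Purely as an illustration, here is a minimal sketch for local testing with Python's built-in http.server. The class name XhtmlHandler, the serve function, and the port number are my own choices, and this is not meant for production use.

```python
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler


class XhtmlHandler(SimpleHTTPRequestHandler):
    # Map the .xhtml extension to the XML media type explicitly, so the
    # Content-Type header does not depend on the platform's MIME database.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".xhtml": "application/xhtml+xml",
    }


def serve(directory: str = ".", port: int = 8000) -> None:
    """Serve `directory` on localhost until interrupted (blocking call)."""
    handler = partial(XhtmlHandler, directory=directory)
    HTTPServer(("127.0.0.1", port), handler).serve_forever()
```

Calling serve() and opening a .xhtml file from that directory should show application/xhtml+xml as the Content-Type in the browser's network tools.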
The HTML 5 abstract model has been designed so that you can almost write it with either an HTML syntax or an XML syntax in text/html. Almost, because even if you write with an XML syntax (closing empty elements, quotes around attributes, etc.) you will run into trouble on complex pages that use scripting and namespaces, due to the way XML parsers and HTML parsers deal with those.
2019 UPDATE
The W3C's own words about XHTML:
"A newer specification exists that is recommended for new adoption in place of this specification. New implementations should follow the latest version of the HTML specification."
So, you should use HTML 5.*

HTML 5 versus XHTML 1.0 Transitional?

It seems that HTML 5 is going to be supported (partially) by Firefox 3.1 and other browsers. It is adding support for video and audio as tags, but these are new tags that XHTML 1.0 Transitional does not recognize. What is the behavior supposed to be if I use a new HTML 5 tag in a future version of Firefox but use the DTD for XHTML? And what if I mix HTML 5 markup with XHTML 1.0 Trans?
This is getting confusing. Why didn't they just add these tags to XHTML? How do we support both XHTML and HTML 5?
Video on HTML 5: http://www.youtube.com/watch?v=xIxDJof7xxQ
HTML5 is so much easier to write than XHTML 1.0.
You don't have to manually declare the "http://www.w3.org/1999/xhtml" namespace.
You don't have to add type attributes to script and style elements (they default to text/javascript and text/css).
You don't have to use a long doctype where the browser just ignores most of it. You must use <!DOCTYPE html>, which is easy to remember.
You don't have a choice to include or not include a dtd uri in the doctype and you don't have a choice between transitional and strict. You just have a strict doctype that invokes full standards mode. That way, you don't have to worry about accidentally being in Almost standards mode or Quirks mode.
The charset declaration is much simpler. It's just <meta charset="utf-8">.
If you find it confusing to write void elements as <name>, you can use <name/>, if you want.
HTML5 has a really good validator at http://validator.nu/. The validator isn't bound by a crappy DTD that can't express all the rules.
You don't have to add //<![CDATA etc. in inline scripts or stylesheets (in certain situations) to validate.
You can use embed if needed.
Just syntax-wise, when you use HTML5, you end up with cleaner, easier to read markup that always invokes standards mode. When you use XHTML 1.0 (served as text/html), you're specifying a bunch of crud (in order to validate against a crappy dtd) that the browser will do automatically.
Myths and misconceptions abound in this thread.
XHTML 1.0 is older than HTML 5. It cannot use any new vocabulary. Indeed, its main selling point was that it uses exactly the same vocabulary as HTML 4.01.
There will be no XHTML 1.2 - most probably. And it is not needed. XHTML 5 is the XML serialization of HTML 5. Identical vocabulary, different parsing rules.
HTML has never been treated as true SGML in browsers. No browser has ever implemented an SGML-compliant parser. HTML 5 will make this fact into a rule and the HTML serialization will follow today's de facto standard. One could perhaps say that it is "SGML-ish".
As has been stated, the DTD serves exactly one purpose IN BROWSERS, and that is to distinguish between standards compliance mode and quirks mode. Thus it affects only styling and scripting. If you are using frames on a page with a strict doctype, they will render just fine. As will <embed> and even <marquee>, even though the latter is an abomination and the former is not in any current standard. It is part of HTML 5, though.
Video and audio can be used regardless of serialization, XML or HTML. They are part of both HTML 5 and XHTML 5. Once the parsing stage is over, a browser will have constructed an internal DOM of the document. That DOM will be for all practical purposes the same regardless of serialization. And yes, XHTML sent with text/html is still normal HTML, regardless of doctype.
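As a very rough illustration of "same elements, different parser", the sketch below feeds one document to Python's XML parser and to its html.parser tokenizer and compares the element names each one sees. Note the assumptions: html.parser is only a tokenizer and not an HTML5 tree builder like the one in a browser, and the sample document and helper names are invented for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Demo</title></head>'
    '<body><p>Hello<br/></p><video controls="controls"></video></body>'
    '</html>'
)


class TagCollector(HTMLParser):
    """Record the name of every start tag the HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_via_html(markup: str) -> list:
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def tags_via_xml(markup: str) -> list:
    # ElementTree prefixes tags with "{namespace}"; keep the local name only.
    return [el.tag.split("}", 1)[-1] for el in ET.fromstring(markup).iter()]
```

Both routes report the same sequence of elements, including video, even though neither parser consults a DTD.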
Well, generally speaking HTML is SGML and XHTML is expressed in XML. Because of that, creating XHTML is connected with more restrictions (in the form of markup) than HTML is. (SGML-based versus XML-based HTML)
As mentioned on Wikipedia, HTML 5 will also have a XHTML variant (XHTML 5).
Rule of thumb: You should always use valid markup. That also means that you should not use the mentioned <video> or <audio> tags in XHTML 1.0 Transitional, as those are not an element of that specification. If you really need to use those tags (which I highly doubt), then you should make sure that you use the HTML 5/XHTML 5 DTD in order to specify that your document is in that DOCTYPE.
Using HTML 5 or XHTML 5 in the given state of the implementation (AFAIK, the standard is not even settled yet, correct?) could be counter-productive, since most users may not see the website rendered correctly anyway.
Edit 2013:
Because of the recent downvotes and since this accepted answer cannot be deleted (by me), I would like to add that the support and standardization process of HTML5 is nowadays totally different to what it was when I wrote this answer five years ago. Since most major browsers support most parts of the HTML5 draft and because a lot of stuff can be fixed with polyfills in older browsers, I mainly use HTML5 now.
You might be looking at the problem the wrong way because, in its "Relationship to XHTML 1.x" section, HTML 5 states:
"This specification is intended to replace XHTML 1.0 as the normative definition of the XML serialization of the HTML vocabulary."
Now that language is controversial (the XHTML 2 WG has disputed it and the HTML WG is trying to resolve the differences...) but that's where we stand right now.
A couple of notes:
HTML 5 includes an XML serialization known as XHTML 5, the spec explains the differences if you're into nitty gritty details
HTML is not SGML. Henri Sivonen has done a great write up on the history of HTML parsing
As of this time (it has been a topic of debate several times), there won't be a DTD for HTML/XHTML 5 -- the Conformance Requirements section of the spec explains why a DTD isn't suitable for defining the HTML language. The HTML 5 validator also contains a wealth of information on this topic (including RELAX NG schemas for HTML5)
Keep in mind that doctypes only serve one purpose in browsers: switch between quirks, almost standards and standards mode. Therefore, using <video> and <audio> will work with any doctype declaration. IMO, using an XHTML doctype is quite useless, as every page you send with text/html MIME type is parsed as (tag-soup) HTML anyways. I suggest using the HTML5 doctype (<!doctype html>), as it is easier to remember and doesn't force you in XML syntax without a reason.
Why didn't they just add these tags to XHTML?
They actually did, there is an XML serialization of HTML 5 (XHTML5). To use this, you have to send your pages with an XML MIME type, such as application/xhtml+xml. This is not (yet) supported by IE, though.
What is the behavior supposed to be if I use a new HTML 5 tag in a future version of Firefox but use the DTD for XHTML?
And what if I mix HTML 5 markup with XHTML 1.0 Trans?
If your markup isn't implemented as part of your chosen DTD - then logically, that markup shouldn't be followed. But browser implementations aren't always strictly logical.
Why didn't they just add these tags to XHTML? How do we support both XHTML and HTML 5?
xHTML is not better than HTML, but it's more suited to some applications. One of the main benefits of xHTML is that it can be transformed into different formats using XSLT. For example, you could use XSLT to automatically transform xHTML into an RSS feed or another XML format.
You don't need to support both formats - weigh up the benefits/drawbacks for each with your project's requirements. HTML 5 probably won't be standard for quite some time.
(X)HTML5 is just the next version. You should be using XHTML1.1 until XHTML5 is well-supported.
You probably should not use the backwards-compatibility SGML profile of HTML5. It makes things harder for scrapers and small parsers.
Your doctype will tell the browser whether you're using HTML5 or XHTML. You can't just shove a tag from one doctype into a document of another doctype and expect it to work.
Without a doctype, it's all just tag soup anyway.
Don't use things like video/audio tags when 99% of people won't be able to view it properly on their browser. For either of these two examples I'd suggest using FLV.
As far as why they don't add it to XHTML... firstly 1.0 isn't the most recent version, 1.1 was released a while ago.
Eventually things get standardized and we'll see these types of tags in both standards, but for now just do what you can to ensure the largest number of people can view your content.

Resources