Siding with HTML over XHTML, My Decision to Switch

The HTML vs. XHTML discussion has been going on for a long time. Solid arguments have been made for each, and more often than not the consensus among authors isn’t to choose one over the other, but to be able to justify your choice. The idea is to make an educated decision and stick with it. I’d like to skip the formalities of explaining the fundamental differences between HTML and XHTML in favor of discussing the factors that shaped my decision to use HTML after years of writing XHTML.

My history with XHTML

For a long time I wrote XHTML without a second thought. I loved the additional rules and restrictions on syntax, mostly because I was so used to seeing poorly written markup. XHTML was so much prettier, more organized, nicer to work with; it had to be. I liked the fact that XHTML needed to be well-formed, and I always wrote my markup in lowercase anyway. It seemed like the additional rules were aimed at authors much like myself. I was quick to adopt XHTML as my language of choice.

On top of that, a lot was being written about XHTML. Many of the designers and developers helping me learn more about my trade were publishing XHTML. Many were vocal about their choice, listing supporting factors in comparison to HTML. It’s comforting to see those you look up to sharing the same techniques, so writing XHTML seemed like a great fit.

Recently, however, I’ve been taking the decision between HTML and XHTML a bit more seriously. I’ve taken some time to examine the differences between the two technologies and how I can apply those differences to what I know. There are fundamental differences between XHTML and HTML, and it’s important to make an educated decision as to which you’ll use.

What prompted my switch to HTML?

Using XHTML began to make less sense for me. I liked that XHTML requires an author to follow more rules, but that was a personal preference, and one that turned out to be partially inappropriate (in my opinion). I wasn’t considering the grand scheme of things: the Web in a general sense.

What started to get under my skin was the fact that I spent a lot of time writing XHTML, but serving it as text/html. Why? I was serving text/html because the servers to which I was publishing are configured like the vast majority of the other servers powering the Internet. They’re not configured to serve XHTML as application/xml or application/xhtml+xml. We (as designers and developers) should be happy about that, however. Internet Explorer (including 7 and 8) does not support XHTML served as application/xml or application/xhtml+xml. IE is by far the reigning browser champion and will remain so for some time. The fact that IE doesn’t support our use of XHTML is a major show-stopper, don’t you think? To circle back, it started to become irritating that I was writing a document one way and serving it another.

Just to be clear, I was publishing XHTML 1.0 which (technically) can be published as text/html, but in a way it seemed counterproductive to me. Additionally, that ability (probably) won’t be available in future revisions of XHTML. More on that later.
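To make the MIME type problem concrete, here is a minimal sketch of the kind of content negotiation a server would need in order to send application/xhtml+xml only to browsers that ask for it. This is not something I had in place; the function name choose_content_type and the sample Accept values are hypothetical and only illustrate the idea.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a Content-Type for an XHTML 1.0 document (illustrative only).

    Returns application/xhtml+xml only when the client explicitly lists
    that type in its Accept header; otherwise falls back to text/html.
    """
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != "application/xhtml+xml":
            continue
        # An explicit q=0 means the client does not accept this type.
        q = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            return "application/xhtml+xml"
    # Wildcards such as */* are deliberately not treated as XHTML support.
    return "text/html"


# Hypothetical Accept values, for illustration only.
print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
print(choose_content_type("image/gif, image/jpeg, */*"))
```

A browser that never lists application/xhtml+xml ends up with text/html, which is exactly the situation I was already in by default: the markup is XHTML, but it is delivered and parsed as HTML.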

Incorporating client work

I try to base my opinions on real-world applications as much as possible. To be more explicit: I try to apply things to client work instead of personal projects. When we’re writing our own documents, of course things are going to work out how we planned (at least we hope so, and we’ll work on it until they do). When it comes to client work (and therefore ‘the rest’ of the Internet) things are quite different.

Publishing content on the Web is in no way limited to professional developers or designers; much of the reason the net is so active is that anyone can make a website. Sure, we (as knowledgeable professionals or hobbyists) all hope to make the Web a better place by doing our part in publishing documents with semantically rich, valid markup, but the reality is that those documents are rare. It’s important to keep in mind the true nature of the Internet: an open platform for information sharing.

The reason I raise that point is that all the work I do during my workweek is building websites for clients to use. They’re the ones managing the content, controlling what happens to their website after it’s pushed live and all questions about the CMS have been answered. It didn’t feel right to me that the documents I was handing over had this strict set of rules embedded within them. Of course, technical details such as which DOCTYPE was being used were hardly a topic of conversation, but the fact remains that clients were given the ability to alter the markup of their website via a WYSIWYG editor.

Sure, the editor we use boasts valid XHTML (like many WYSIWYG editors), but more often than not it would only be a few days before markup errors would sprout up throughout the website. Had the site been served with the proper MIME type, each of those errors would have prevented the document from loading, and my office would have been getting phone calls with every update the client tried to make, defeating the purpose of giving them access to a CMS.
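The difference in error handling is easy to see outside of a browser. The sketch below uses Python’s standard library as a stand-in: xml.etree.ElementTree rejects markup that isn’t well-formed, while html.parser keeps reading past the same mistakes. Neither is a browser engine, and the sample markup, the TagCollector class, and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records every start tag the tolerant HTML parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parses_as_xml(markup: str) -> bool:
    """True if the markup is well-formed XML, False if the parser gives up."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def tags_seen_by_html_parser(markup: str) -> list:
    """Feed the markup to the tolerant parser and list the tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# Hypothetical client edit: an unclosed <br> and a <b> that is never closed.
broken = "<div><p>Updated by the client<br><b>Sale!</p></div>"
fixed = "<div><p>Updated by the client<br/><b>Sale!</b></p></div>"

print(parses_as_xml(fixed))
print(parses_as_xml(broken))
print(tags_seen_by_html_parser(broken))
```

With the strict parser, a single stray tag from the WYSIWYG editor is enough for the whole document to be rejected; the tolerant parser carries on and still sees every element.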

Publishing XHTML, and therefore imposing the associated strict ruleset, didn’t seem like the best solution any longer.

The opinions of browser vendors

Some time ago, I came across an article which included some great links to publications by major browser vendors outlining their preference concerning HTML vs. XHTML:

There is some great information provided in those articles, especially the interview with Håkon Wium Lie. A quote from that interview, which has been included in many articles surrounding the HTML vs. XHTML debate, is worth reading at least twice:

XHTML2 has some very good ideas that I hope can become part of the web. However, it’s unrealistic to think that all web authors will switch to an XML-based syntax which demands that browsers stop processing the document on the first error. XML’s draconian policy was an attempt to clean up the web. This was done around 1996 when lots of invalid content entered the web. CSS took a different approach: instead of demanding that content isn’t processed, we defined rules for how to handle the undefined. It’s called “forward-compatible parsing” and means we can add new constructs without breaking the old.

So, I don’t think XHTML is a realistic option for the masses. HTML 5 is it.

I lean quite a bit toward Mr. Lie’s last statement: “… I don’t think XHTML is a realistic option for the masses. HTML 5 is it.” That’s a very powerful statement coming from someone for whom we all have a deep respect. His sentiment ties closely with my experience in Web development, especially with client projects.

Looking to the future

Rereading a piece like The Road to XHTML 2.0: MIME Types was also a major reality check. One of the major take-home notes from that article is: “Although the spec is not finalized yet, all indications are that XHTML 2.0 must not be served as text/html.” That’s a major implication which could have an adverse effect on your workflow.

It’s also interesting to compare XHTML 2.0 to HTML 5. From the HTML 5 spec:

XHTML2 defines a new HTML vocabulary with better features for hyperlinks, multimedia content, annotating document edits, rich metadata, declarative interactive forms, and describing the semantics of human literary works such as poems and scientific papers.

However, it lacks elements to express the semantics of many of the non-document types of content often seen on the Web. For instance, forum sites, auction sites, search engines, online shops, and the like, do not fit the document metaphor well, and are not covered by XHTML2.

This specification aims to extend HTML so that it is also suitable in these contexts.

XHTML2 and this specification use different namespaces and therefore can both be implemented in the same XML processor.

HTML 5, in general, seems to be a more attractive solution taking into account the work I find myself doing most often. I hope to soon devote more of my personal time to following the development and implementation of HTML 5 as it matures over the coming years.

Taking a step back

I realize that the bulk of this article outlines some faults I’ve personally found with using XHTML over HTML. I’m partially disappointed in that, but on the other hand, a decision I had made needed to be disproved in order for a change to happen. I didn’t want to make this choice lightly; I wanted to educate myself more on the core differences between the two and try to determine the better choice for me. I wanted this article to be based upon my personal reasons for switching, as opposed to a piece outlining the pros and cons of each. We can all accept that there are definite pros and cons to whichever decision you make.

I know I’m not alone in saying that I prefer the restrictions imposed by writing XHTML over HTML. That hasn’t stopped me from continuing to pay attention to the details of my markup, even if it is HTML. I still write everything in lowercase, and I continue to keep everything well-formed. Although end tags aren’t always required, I can’t help but include them (where applicable). I still quote my attribute values as well. At the end of the day, the only truly noticeable differences are the lack of self-closing tags and a different DOCTYPE.

You should definitely have an opinion regarding your decision to use HTML or XHTML, and it would be great if you’d take a minute to leave your thoughts in the comments following this article. I’d love to hear some applicable reasons supporting your choice, based on your experiences as well as the requirements of your workload. If you’ve got some things to share, please take a minute to leave your mark and/or respond to someone else’s.