Converting XHTML to HTML5

written by: Waleed Zuberi•edited by: Amber Neely•updated: 8/26/2010

HTML5 seems all the rage these days. However, as with most buzzwords on the web, there is some fear, uncertainty and doubt when it comes to implementing it on our own sites. This article will guide you through converting a simple XHTML document to HTML5.

    Is it feasible?

    Apple’s decision to support HTML5’s multimedia capabilities over the more popular Adobe Flash has perhaps made the biggest splash in favor of the new web standard. With browsers touting their support for it and many video sharing sites like YouTube implementing it, HTML5 seems all the rage these days.

    But first let’s talk about whether or not you really need HTML5 right now.

    It’s obviously the future of web design and development: it offers great features and takes some of the bite out of the developer’s job. But is it feasible? Should you run out and change all your websites to HTML5 right away?

    Well, not exactly. The first thing to realize is that it is in no way a requirement. XHTML1.0 and even HTML4 have been serving perfectly well for the last decade or so and there’s no need to suddenly give in to the hype just because everyone else is doing it.

    The decision to convert needs to take several things into account, such as development costs and, most importantly, browser support. Even after the HTML5 specs are finalized, browser vendors will take some time to catch up. Each new browser release adds more support for the various features in the new HTML spec, but many versions currently have buggy or no support for several of the most exciting things, such as editable content and video.

    The new HTML5 features may be moot depending on which browsers your average site visitors use, so it’s a good idea to look over your site statistics to gauge what type of audience you get. The worst culprit is of course IE6, with next to no support for HTML5; it is a nine-year-old browser after all! You can take a look at this frequently updated chart to see which browser version supports which spec, or you can test your own browser.

    To sum up, yes you can use HTML5 right away, but as always, think before you leap.

    Even with this seemingly crippling blow, there are ways to ensure graceful degradation. Each section below will highlight its own compatibility report.
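
    One common route to graceful degradation is to test for a feature in script before relying on it. The following is a minimal sketch, not a complete detection library: the function name detectHtml5Support is made up for this example, and it checks only video, canvas and editable content.

```javascript
// Hypothetical helper: reports which HTML5 features a document supports.
// Pass in the browser's `document`; any object with createElement() works.
function detectHtml5Support(doc) {
  var video = doc.createElement("video");
  var canvas = doc.createElement("canvas");
  var div = doc.createElement("div");

  return {
    // Browsers without <video> create a generic element with no canPlayType.
    video: !!video.canPlayType,
    // Browsers without <canvas> create a generic element with no getContext.
    canvas: !!canvas.getContext,
    // Editable content is exposed as the contentEditable property.
    editableContent: "contentEditable" in div
  };
}

// In a browser you would call: detectHtml5Support(document)
```

    Each property of the returned object is true or false, so a page can decide per feature whether to use it or fall back to something simpler.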

    Cleaning Up

    HTML5 is all about clean, efficient and semantic code. The first thing you see when looking at our example XHTML page is the ugly DOCTYPE declaration, which flies directly in the face of our noble goals. Here it is again:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

    DOCTYPE declarations have always annoyed many a web developer. They’re impossible to memorize and very easy to mistype. But they are absolutely necessary to ensure proper rendering across all browsers and as such need to be the first thing on any webpage. Good thing the DOCTYPE in HTML5 is a little better:

    <!DOCTYPE html>

    And that’s it!

    Next up is the <html> tag. Right now it looks like this:

    <html xmlns="http://www.w3.org/1999/xhtml" lang="en">

    The xmlns (XML namespace) attribute is unnecessary in HTML5: when a page is served as HTML, its elements are placed in that namespace automatically. It can therefore be safely removed, leaving <html lang="en">.

    In the <head> section we have some character set information and a link to the CSS stylesheet. And of course, the title. First up is the charset tag, which can be shortened to simply

    <meta charset="utf-8" />

    Then we have the stylesheet link. You’ll notice it explicitly declares its type as CSS, but HTML5 already assumes text/css for stylesheets, so the type attribute can be discarded to make

    <link rel="stylesheet" href="style.css" />

    The same goes for <script> tags: HTML5 assumes JavaScript by default, so you can safely remove the type attribute.

    You can make any of the above changes right now without having to worry about browser compatibility or rendering issues. Over a large project, the (seemingly) small changes made above can also help reduce file size and in turn server load and bandwidth usage.
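
    On a large project you may not want to make these edits by hand. As a rough sketch only, the clean-up steps above could be scripted with a few regular expressions. The function name cleanUpXhtml is hypothetical, and the patterns assume simple, consistently written markup like our example page; they are no substitute for a real parser.

```javascript
// Hypothetical helper: applies the clean-up steps to an XHTML string.
// Regex-based, so only suitable for simple pages like the example.
function cleanUpXhtml(markup) {
  return markup
    // Replace the long DOCTYPE with the short HTML5 form.
    .replace(/<!DOCTYPE[^>]*>/i, "<!DOCTYPE html>")
    // Drop the XML namespace attribute from the <html> tag.
    .replace(/\s+xmlns="[^"]*"/i, "")
    // Shorten the http-equiv style charset declaration.
    .replace(/<meta[^>]*charset=([\w-]+)[^>]*>/i, '<meta charset="$1" />')
    // Remove type attributes from CSS and JavaScript references.
    .replace(/\s+type="text\/(css|javascript)"/gi, "");
}
```

    Run it on a copy of your files first and compare the result, since markup that is written differently from the example may not match these patterns.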

    The Semantics

    So far all the changes we’ve made to our HTML page have been about removing clutter and making our code easier to read, write and maintain.

    The changes we are now going to make are about semantics, i.e. making our code make more sense. This is perhaps a little more exciting than cleaning up existing code, but comes with a small, rather annoying caveat. Let’s convert the page first.

    First up we have the header <div> that contains a logo and navigation links. As you can see, we’re using additional <div>s with appropriate IDs.

    <div id="header">

    <div id="logo">

    <h1>My Awesome Site</h1>

    </div>

    <div id="nav">

    <ul>

    <li><a href="#home">Home</a></li>

    <li><a href="#about">About me</a></li>

    <li><a href="#contact">Email me!</a></li>

    </ul>

    </div>

    </div>

    Let’s get semantic on its…err…head. That becomes this:

    <header>

    <div id="logo">

    <h1>My Awesome Site</h1>

    </div>

    <nav>

    <ul>

    <li><a href="#home">Home</a></li>

    <li><a href="#about">About me</a></li>

    <li><a href="#contact">Email me!</a></li>

    </ul>

    </nav>

    </header>

    Doesn’t that make more sense? A <header> contains our title (the logo) and our page <nav>.

    After that there’s the main <div> that holds our page content and a sidebar for additional links. Here’s what the HTML5 would be:

    <div id="main">

    <aside>

    <p>Extra links, additional information, secondary navigation etc.</p>

    </aside>

    <article>

    <h2>Content title</h2>

    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque eget bibendum sapien. Nulla aliquet hendrerit suscipit. Vivamus consectetur consequat purus non laoreet. Suspendisse mattis placerat dolor, at gravida mi mattis ut. Maecenas egestas gravida turpis non molestie.</p>

    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque eget bibendum sapien. Nulla aliquet hendrerit suscipit. Vivamus consectetur consequat purus non laoreet. Suspendisse mattis placerat dolor, at gravida mi mattis ut. Maecenas egestas gravida turpis non molestie.</p>

    </article>

    </div>

    You’ll notice here and above, I’m still using <div>s; there’s nothing wrong with that! A <div> still means what it did before HTML5 – a division. It’s just that with HTML5 we have more semantic tags to replace some of the <div>s in our designs.

    Then comes the footer, which naturally turns into this:

    <footer>

    <p>&copy; 2010</p>

    </footer>

    So here we have it. A rather smart looking, semantic and tidy HTML5 page. We’re almost done, but there’s one problem left to be dealt with. Head over to the next page to see it and the simple solution.

    The Problem

    Because tags such as <header>, <nav>, <footer> and <article> are new in HTML5, old browser versions don’t recognize them. Since each browser has its own way of dealing with unknown elements, this can cause our new HTML5 page to break miserably, especially in Internet Explorer.

    To illustrate, let’s look at the following code:

    <!DOCTYPE html>

    <html lang="en">

    <head>

    <meta charset="utf-8" />

    <link rel="stylesheet" href="style.css" />

    <script src="jscript.js"></script>

    <title>A Simple Page</title>

    <style type="text/css">

    body { width: 700px; margin: 0 auto; }

    header { border: solid 2px red; }

    article { margin-right: 270px; }

    aside { float: right; width: 250px; background: #efefef; }

    footer { clear: both; background: #000; color: #fff; }

    </style>

    </head>

    <body>

    <header>

    <div id="logo">

    <h1>My Awesome Site</h1>

    </div>

    <nav>

    <ul>

    <li><a href="#home">Home</a></li>

    <li><a href="#about">About me</a></li>

    <li><a href="#contact">Email me!</a></li>

    </ul>

    </nav>

    </header>

    <div id="main">

    <aside>

    <p>Extra links, additional information, secondary navigation etc.</p>

    </aside>

    <article>

    <h2>Content title</h2>

    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque eget bibendum sapien. Nulla aliquet hendrerit suscipit. Vivamus consectetur consequat purus non laoreet. Suspendisse mattis placerat dolor, at gravida mi mattis ut. Maecenas egestas gravida turpis non molestie.</p>

    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque eget bibendum sapien. Nulla aliquet hendrerit suscipit. Vivamus consectetur consequat purus non laoreet. Suspendisse mattis placerat dolor, at gravida mi mattis ut. Maecenas egestas gravida turpis non molestie.</p>

    </article>

    </div>

    <footer>

    <p>&copy; 2010</p>

    </footer>

    </body>

    </html>

    The only difference between this and our completed document on the previous page is the CSS styles. Here's what it looks like in Google Chrome. Excuse the shoddy design, please.

    But look at it in IE6 (or, for that matter, even in IE8) and it looks very strange, almost as if no CSS had been applied!

    Why is that happening, you ask? Because the CSS styles were set for <header>, <aside> and <footer>, which IE (before version 9) does not recognize and so does not style. IE has a very large market share and cannot be ignored. Even now many people are still using IE6 (which was released in 2001), and even with the coming release of IE9 this problem is not likely to go away.

    The Solution

    So how do we fix it? Browsers such as Firefox 3 and Opera 10 keep the unknown elements but treat them as inline, so a single CSS rule will do the trick. Chrome, Safari and other Webkit browsers don’t have this problem to begin with.

    header, nav, article, footer, aside, address { display: block; }

    However, IE versions before 9 (not only IE6) refuse to style elements they don’t recognize, so for them we will also need some JavaScript.

    <!--[if lt IE 9]>

    <script type="text/javascript">

    document.createElement("article");

    document.createElement("footer");

    document.createElement("header");

    document.createElement("aside");

    document.createElement("nav");

    </script>

    <![endif]-->

    What this basically does is make IE aware of the new elements by creating one of each (only in IE before version 9, thanks to the conditional comments), which allows the CSS styles to be applied.
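
    The repeated calls can also be written as a loop. This is only a sketch of the same technique: the function name registerHtml5Elements and the HTML5_ELEMENTS list are made up for this example, and the document object is passed in as a parameter so the helper can be tried outside a browser.

```javascript
// The five new elements used on our example page (illustrative list).
var HTML5_ELEMENTS = ["article", "footer", "header", "aside", "nav"];

// Hypothetical helper: creates one of each element so that old versions
// of IE learn the tag names. Returns how many names were registered.
function registerHtml5Elements(doc, tagNames) {
  for (var i = 0; i < tagNames.length; i++) {
    // The created element is discarded; creating it is what matters.
    doc.createElement(tagNames[i]);
  }
  return tagNames.length;
}

// Inside the conditional comment you would call:
// registerHtml5Elements(document, HTML5_ELEMENTS);
```

    Adding support for another element, such as <section>, then only means adding its name to the list.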

    To make things even easier, Remy Sharp released a thorough HTML5 enabling script which you can include in your pages. The advantage of using Remy’s script is that it’s cleaner, more thorough and also fixes an issue where printing the elements does not work. To use it, simply replace the above JavaScript with this:

    <!--[if lt IE 9]>

    <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>

    <![endif]-->

    And with that, you’re done. Cross-browser compatible, semantic HTML5. Doesn’t it feel good?

    Looking Back

    This article has been about converting our web page to HTML5. We started by talking about whether or not it's even a good idea and looked over some browser-compatibility charts. Then we tidied up our "messy" page with clean HTML5, replaced simple <div>s with proper, semantic tags and later fixed an issue for older browsers with CSS and some JavaScript.

    Before you start the process on your own website, think about the type of audience your site gets and what sort of browsers they tend to use. For example, if the majority of your visitors have JavaScript disabled for some reason, then it would be a bad idea to go ahead with the conversion, because the fix for older versions of IE depends on it. Some of the more advanced features HTML5 has to offer, such as video and graphics, have sketchy support even in the latest browsers, so more caution is needed when implementing them.