Adventures in HTML5. Part one: Using HTML5 now

There’s no real reason I’m aware of for not using an HTML5 DOCTYPE and HTML5 tags (with the exception of audio and video, which can be fixed). So that means the first line of any HTML document should be:

<!doctype html>

Which is reason enough to start using HTML5. But it also means that instead of structuring pages with:

<div id="header">
... <div id="nav">
<div id="content">
... <div class="post">
<div id="sidebar">
<div id="footer">

You can use:

<header>
... <nav>
<article>
... <header>
... <section>
... <footer>
<aside>
<footer>

Which is very exciting (and radically different from the last seven or so table–free years of markup). At last we can structure documents with real tags. There will be fewer <div>s in the world, which is a good thing.

Browser support

All modern browsers interpret HTML5 tags and allow CSS to style aside, header etc., and you can force Internet Explorer to join the party with a small snippet of JavaScript. I hotlink to some Google Code–hosted JavaScript by adding the following to the head section of HTML pages:

<!--[if IE]><script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script><![endif]-->

(Note the lack of a script type. That’s one of the many great things about HTML5: It’s stripped out lots of (X)HTML verbosity, which makes it a lot more friendly.)
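
For the curious, here is a minimal sketch of the idea behind scripts like this one. It is not the actual html5shiv source: the function name and the element list are hypothetical, chosen for illustration. It relies on the well-known behaviour that older versions of Internet Explorer only style an unknown tag once document.createElement has been called with that tag name.

```javascript
// Illustrative subset of the new HTML5 tags (an assumption, not the
// list any real script uses).
var HTML5_ELEMENTS = ["article", "aside", "footer", "header", "nav", "section"];

// Hypothetical helper: "registers" each tag name with the given document.
// The created elements are thrown away; the call itself is what matters
// to old versions of Internet Explorer.
function registerHtml5Elements(doc) {
  var registered = [];
  for (var i = 0; i < HTML5_ELEMENTS.length; i++) {
    doc.createElement(HTML5_ELEMENTS[i]);
    registered.push(HTML5_ELEMENTS[i]);
  }
  return registered;
}

// In a browser, run it against the real document. Outside a browser
// (no global document) this is skipped.
if (typeof document !== "undefined") {
  registerHtml5Elements(document);
}
```

A script like this has to run in the head, before the browser parses the body, which is why the conditional comment above sits in the head section.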

Which means the following browsers and user agents don’t interpret HTML5:

  • Internet Explorer with JavaScript turned off
  • Screen readers

Which isn’t a problem because one audience is going to be absolutely tiny (is there any reason to browse in IE with JavaScript turned off?) and the other will simply ignore the HTML5 tags. Another great feature of HTML5 is its graceful degradation: browsers adopt the standard at different rates, but most of the time the fallback option is perfectly acceptable.

Most HTML5–specific tags are currently implemented by (ab)using the virtually meaningless div (div groups bits of content arbitrarily; it doesn’t indicate why these bits of content have been grouped). As screen readers do nothing with divs, it makes no difference if we use header or article instead of div id="header" or div id="content". And once screen readers start to understand HTML5, users will begin to benefit from more meaningful documents. For example, screen readers could separate navigation from content by looking at the markup contained between nav tags.

Caveat

As ever, all is not completely simple. The problem HTML5 poses to screen readers is not the introduction of a new set of tags, but the way it handles h1-h6. In previous HTML specs, a document had a single hierarchy of up to six heading levels (h1-h6), so by convention it made sense to only use one top-level heading because it would mark up the document’s title, and documents normally only have one title. So your document would use h2s, h3s et al to head other pieces of content.

In HTML5 each node can have its own hierarchy of headings (so each header, footer, article and aside can have an h1), although it doesn’t have to (HTML5 is a good friend because it’s so flexible). Looking at our traditionally marked-up page, it would be structured along the lines of:

<div id="header">
<h1>Site title</h1>
... <div id="nav">
<div id="content">
... <div class="post">
... <h2>Document title</h2>
<div id="sidebar">
<h3>Sidebar section title</h3>
<div id="footer">

Whereas our HTML5 page can be structured like this:

<header>
... <h1>Site title</h1>
... <nav>
<article>
... <header>
... <h1>Article title</h1>
... <section>
... <footer>
<aside>
... <h1>Aside section title</h1>
<footer>

Which is logical, but means that screen readers (which use the traditional method) just see a bunch of first-level headings, which corrupts the structure of the document.
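
To make the difference concrete, here is a rough sketch that walks the HTML5 page above and reports each heading twice: the level a traditional reader sees (the number in the tag) and the level implied by nesting inside sectioning elements. This is a deliberately simplified stand-in for the spec’s outline algorithm, not the algorithm itself. The tree format ({ tag, text, children }), the function name and the "add one level per sectioning ancestor" rule are all assumptions made for illustration.

```javascript
// Elements treated as opening a new section (simplifying assumption).
var SECTIONING = ["article", "aside", "nav", "section"];

// Hypothetical helper: collects every heading with its "flat" level
// (the digit in the tag) and its "nested" level (digit plus the number
// of sectioning ancestors).
function headingLevels(node, depth, out) {
  depth = depth || 0;
  out = out || [];
  var match = /^h([1-6])$/.exec(node.tag);
  if (match) {
    var rank = Number(match[1]);
    out.push({ text: node.text, flat: rank, nested: rank + depth });
  }
  // Only sectioning elements push their contents one level deeper;
  // header and footer do not.
  var childDepth = SECTIONING.indexOf(node.tag) !== -1 ? depth + 1 : depth;
  var children = node.children || [];
  for (var i = 0; i < children.length; i++) {
    headingLevels(children[i], childDepth, out);
  }
  return out;
}

// The HTML5 page from above, written as a plain object tree.
var page = {
  tag: "body",
  children: [
    { tag: "header", children: [{ tag: "h1", text: "Site title" }, { tag: "nav" }] },
    { tag: "article", children: [
      { tag: "header", children: [{ tag: "h1", text: "Article title" }] },
      { tag: "section" },
      { tag: "footer" }
    ] },
    { tag: "aside", children: [{ tag: "h1", text: "Aside section title" }] },
    { tag: "footer" }
  ]
};

var levels = headingLevels(page);
// levels[0]: "Site title"           flat 1, nested 1
// levels[1]: "Article title"        flat 1, nested 2
// levels[2]: "Aside section title"  flat 1, nested 2
```

The flat column is all 1s, which is the "bunch of first-level headings" problem. The nested column is the hierarchy the markup was meant to express.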

It’s difficult to suggest a way round this. HTML5 is our flexible friend, so you can continue to use a strict h1-h6 structure, but it strikes me as illogical, considering the new DOM structure which allows us to have document level footers and footers within articles.

So are you using HTML5 on a day-to-day basis?

Or is it still too experimental for real client work?

10 thoughts on “Adventures in HTML5. Part one: Using HTML5 now”

  1. Florent V.

    Nice article. But: http://xkcd.com/386/

    First, I’m not sure where you got the idea that HTML4 says you can’t have a bazillion H1 elements in your document. It doesn’t. It doesn’t require that you use one, or just one, or twelve, or anything. The “one H1 per page” rule is actually not a rule, and comes from SEO (and even from an SEO perspective, it’s kinda wrong).

    Then, what you write about screen readers is true. The logic of section headings in HTML5 borrows from XHTML2 (hey, looks like it’s not totally dead after all), where the structure of sectioning elements determines the “real” level of a heading. XHTML2 had just a H element (no number), but HTML5 is keeping H1-H6 for backwards compatibility. So in the end the spec advises two different options for writing good document outlines:

    1. Use only H1, H1 frigging everywhere, and SECTION and other sectioning elements (hum, ARTICLE, BLOCKQUOTE, a few others) for specifying the level of the title and relevant content in the outline. Aka XHTML2-style.
    2. Use H1-H6, ignoring sectioning elements. Aka HTML4-style.

    All this is nice, but I suspect we’re in for some major wreckage, as:

    - Authors already have trouble getting their HTML4-style headings right. HTML5 is bound to confuse them, especially with two ways of doing things. I predict we will continue to see pages with bad outlines, and may see even more of them in the future.
    - Screen reader publishers are awfully slow sometimes to support basic stuff like, hum, HTML4. Don’t hold your breath waiting for them to update their software to cope with HTML5 sections/outlines better. Oh, and if authors start using HTML5 but mess up their outlines, there won’t be much of an incentive to support proper outlining algorithms.

    This is going to be fun. :)

  2. David Oliver

    While I’m not working on sites which require the extra web apps functionality that HTML5 will offer (but which isn’t yet very well supported by desktop browsers), I’m going to stick to XHTML 1 Strict and using sensible id and class names.

    Thanks for the read.

  3. Leon Post author

    @Florent — many thanks for taking the time to comment (and the pic—I’m not that wrong, am I? :) )

    I’ve corrected the sentence about how HTML4 handles headings. Thinking about this, I guess people equate the document’s title with h1 and, as a document should logically only have one title, it follows that it should only have one h1. But I appreciate there’s nothing in the spec about this usage of h1, or how many h1s you can place in a document.

    Document authors will have to choose which heading scheme they follow. As long as it’s one or the other HTML5 shouldn’t cause more bad structures. If I was being really optimistic I might even argue that it’ll get authors thinking about structure, especially as they’re not simply wrapping content up in divs all the time.

    I take your point about screen readers’ adoption of HTML specs. I guess a href="#content" will be with us for some time.

    “Use only H1, H1 frigging everywhere”

    Yes, this feels plain odd; I’m going to list a few of the problems I’ve had with HTML5; this is one of them.

    Once again, thanks for your comment.

  4. Leon Post author

    @David Oliver — thanks for reading.

    I’d personally go with article (or even article id="content") as it means more than div id="content". I don’t think it’s the new tags that will cause problems, except for when the spec changes, and that’s not going to change the user’s experience of your page.

  5. David Oliver

    I used to think of h1 as being for the document title (e.g. the company name for a corporate site), but I now regard it as simply the first level of heading (subheading in the context of a page), meaning that I use a new h1 for each new primary section of content. The document title, and only the document title, is the title, as it were. :)

    Regarding using the article tag, it’s only for external content: http://www.w3schools.com/html5/tag_article.asp – did you mean the section tag? I wouldn’t use that in XHTML as it means the document wouldn’t be valid according to my chosen document type. I’ll simply carry on trying to describe content using the minimum number of divs possible.

  6. Leon Post author

    @David — thanks again.

    Yes, you’re right about h1; I guess I was referring to convention rather than the spec itself.

    As for article I think it demonstrates another difficulty with HTML5, namely how to interpret more abstract structural tags. After all, paragraphs, headings etc. are not open to interpretation quite as much. However, I read article as content that could be syndicated externally. Just having a tag that marks up external content would be rather limiting:

    “The article element represents a component of a page that consists of a self-contained composition in a document, page, application, or site and that is intended [my emphasis] to be independently distributable or reusable, e.g. in syndication. This could be a forum post, a magazine or newspaper article, a blog entry, a user-submitted comment, an interactive widget or gadget, or any other independent item of content.” (HTML5 Core and Vocabulary)

    (PS—It would have been nice to have used the mark tag in there, but I don’t think WordPress would parse it).

  7. David Oliver

    I see what you mean, Leon – thanks. The W3Schools site doesn’t go into much depth, so I think I shall have to stop relying on it for interpreting the specs.

  8. Leon Post author

    @David, I was surprised by that entry at W3Schools; it seems quite a long way from the actual spec. I had a terrible moment where I thought I’d got things completely wrong.

    section is more problematic, I think.
