20 December 2009

HTML and CSS for ASP.NET Developers:
Part Two, Semantic HTML

About the series

This is a series about basic HTML and CSS knowledge. It is written for those of you who have experience with ASP.NET Web Forms, but feel your knowledge of HTML/CSS has fallen behind.

An important foundation of elegant web pages is semantic HTML. It is an ideal based on the separation of content and presentation. Ideally the HTML code should take care of the content only, and the CSS code should handle the presentation. But as always, reality makes things more complicated. CSS is unfortunately not powerful enough to style the content without depending on the HTML. One simple example is rounded corners, which are often implemented as div elements positioned in each corner, each with a background image of that particular corner.
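The rounded-corner technique can be sketched roughly like this. It is only an illustration: the class names, colors and image file names are made up. Note that the four empty div elements carry no content at all; they exist purely so the CSS has something to hang the corner images on.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Rounded corners sketch</title>
  <style type="text/css">
    /* The box is the positioning context for its four corners. */
    .box { position: relative; padding: 10px; background: #cfc; }

    /* Each corner is a small absolutely positioned element. */
    .corner { position: absolute; width: 10px; height: 10px; }

    /* Hypothetical image files, one per corner. */
    .top-left     { top: 0;    left: 0;  background: url(corner-tl.png) no-repeat; }
    .top-right    { top: 0;    right: 0; background: url(corner-tr.png) no-repeat; }
    .bottom-left  { bottom: 0; left: 0;  background: url(corner-bl.png) no-repeat; }
    .bottom-right { bottom: 0; right: 0; background: url(corner-br.png) no-repeat; }
  </style>
</head>
<body>
  <div class="box">
    <!-- Presentational elements only: no content lives here. -->
    <div class="corner top-left"></div>
    <div class="corner top-right"></div>
    <div class="corner bottom-left"></div>
    <div class="corner bottom-right"></div>

    <p>The actual content of the box.</p>
  </div>
</body>
</html>
```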

Supporting semantic HTML seems to have had low priority when Web Forms was developed. Back then things were very different. CSS was harder to use, and some felt they could only realize their designs by using tables to structure the content visually. They sacrificed the semantics in favor of the power the table element gave them. Of course they knew their body column wasn't a table column, but they decided it didn't matter. And who am I to blame them? Things really were harder back then, and sometimes you just have to be pragmatic. Today it is easier to style semantic HTML, and the more competent the browsers get, the easier it gets.

Since Web Forms generates HTML from another age (with a lot of nested tables and other obsolete habits), you cannot trust it when you develop modern web pages. You need to take back control of the markup. Only trust HTML you have written, generated or inspected yourself. To do this you will have to write your own server controls or modify existing ones. You can also use control adapters. Or do as I do: use the Repeater or similar controls that let you control the HTML via templates.

Learn the tags

Semantic HTML depends on a conscious use of its different types of elements. By using ul elements for unordered lists, a elements for links, h1 elements for the main header, p elements for paragraphs and so on, the structure of the document (and a web page is a document) is captured on a level other than the purely visual. And I can assure you, having this structure under control makes things a lot easier in the long run. It's a solid foundation to build upon.
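As a small sketch of what that looks like in practice, here is a fragment meant to sit inside the body element. The texts and URLs are invented for illustration; the point is only that each piece of content is wrapped in the element that describes what it is.

```html
<!-- The main header of the page -->
<h1>Welcome to our site</h1>

<!-- A paragraph of running text, containing a link -->
<p>
  We have been in business for many years.
  Read more on our <a href="/about">about page</a>.
</p>

<!-- An unordered list: the order of the items carries no meaning -->
<ul>
  <li><a href="/products">Products</a></li>
  <li><a href="/services">Services</a></li>
  <li><a href="/contact">Contact</a></li>
</ul>
```

Nothing here says anything about fonts, colors or positions. That is left to the CSS.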

When you are uncertain how a specific element is supposed to be used, just google it. There are a lot of examples on the web. For me it has been the primary source of information, and it still is!

Learning semantic HTML is not rocket science. The key is to learn to separate content and presentation. Once you get the hang of it, it is simply a matter of learning a lot of simple rules and getting to know the different elements.

A concrete example—Green Company

Let's get our hands a bit dirty! I feel the best way of showing what semantic HTML is about is to look at an example design and tag the different parts of it.

Imagine you are going to create a simple site for a company called Green Company. The designer proudly sends you an image of what a basic page should look like. Now, where do you begin? When I began writing web pages, the first thing I saw was what the page looked like. Now I try to look at the document structure first and write it down as HTML.

To begin with, I divide the page into sections. Those sections will be tagged as div elements, since there are no other element types today (in HTML 4/XHTML 1) meant to capture the overall structure of a page.

Most pages consist of a header, a content area and often a footer as well. I usually place these elements in a div representing the whole page. Some might find this element superfluous, but I have found it useful both for layout purposes and to separate its content from elements introduced by Web Forms, scripts and other sources.

The sketch the designer sent you describes a fairly common page structure. It contains a header and a content area, but no footer. The header contains a logotype and a main menu, while the content area contains a body and a news area. Since all elements (except the logotype and the main menu) are div elements, they have to be identified via the id attribute. This is what the HTML code of the structure would look like:

<div id="page">
  <div id="header">
    <h1 id="logotype"></h1>
    <ul id="main-menu"></ul>
  </div>
  <div id="content">
    <div id="body"></div>
    <div id="news"></div>
  </div>
</div>

The structure above would be a good candidate to put in a master page (depending on the other types of pages the site will contain, you might want to divide it into several nested master pages). I will not dig deeper into this example in this post, but I hope I have stressed the importance of writing semantic HTML and of analyzing the structure of the page.

Divitis and Classitis

It takes a while to learn which (X)HTML elements exist, and I have to admit that I still search the net sometimes to find the right element for a specific situation, even though I have been a professional web developer with a focus on HTML/CSS for years. (For example, I had a vague memory of a blockcode element I intended to put the HTML code above in. But it is not part of XHTML 1.0 Strict, so I used a pre element instead.)

Before you have learnt the different elements, it is common to overuse div and span elements and give them a class or an id to tell what they do. For instance the not too graceful <div class="header"> is common. Use a heading element (h1 to h6) instead! This is appreciated by screen readers, search engines and eventually even yourself, because good semantic HTML is a well thought out structure represented in a clear language, easily read by those who know it.
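A small before-and-after sketch of this point, assuming the div was standing in for a heading. The heading text and the choice of h2 are hypothetical.

```html
<!-- Before: a generic element, with a class name doing the explaining -->
<div class="header">Latest news</div>

<!-- After: the element itself says that this is a heading -->
<h2>Latest news</h2>
```

The second version needs no class for a style sheet to find it, and a screen reader or search engine can tell what it is without guessing.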

The overuse of div elements and classes is often called divitis and classitis. These terms are said to have been coined by Jeffrey Zeldman in “Designing With Web Standards”.

Beyond anorectic HTML

Some seem to use a machine gun, spewing out elements like a maniac, when they implement a web page. I didn't want to be like that, and I worked hard to write the most minimal HTML code you could imagine, while I cockily diagnosed a lot of other developers as suffering from divitis and classitis.

I had worked a couple of years when I suddenly realised my elegant minimal HTML code actually didn't work too well. It was hard to style, since CSS is not powerful enough, and I also realised a couple of extra divs could make my code easier to follow.

Nowadays I am not too afraid of adding some extra div elements, each with an id or class to explain its purpose. The page div I often use is an example of this.
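To illustrate the trade-off, here is a rough sketch using the news area from the Green Company structure. The heading and the list items are invented; only the news id comes from the example above.

```html
<!-- Minimal version: nothing groups the heading and the list,
     so the news area is hard to style or position as a whole -->
<h2>News</h2>
<ul>
  <li>First news item</li>
  <li>Second news item</li>
</ul>

<!-- With one extra div: the id names the section and gives the
     CSS a single element to position, pad or give a background -->
<div id="news">
  <h2>News</h2>
  <ul>
    <li>First news item</li>
    <li>Second news item</li>
  </ul>
</div>
```

The extra element adds no content, but it makes the structure explicit and easier to follow.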

Avoid classitis and divitis, but do not produce anorectic HTML.