Up until now, you have been experimenting with HTML by putting tags into your documents, saving them, and viewing them in a browser. Once the page looks OK, you move on to the next page.
When you’re just starting out with HTML, this is a great approach. However, if you intend to do web design on a professional level, you can run into problems.
Most browsers have been written with the assumption that the people who write Web pages will make a lot of mistakes. If you write something that isn’t according to the rules, the browser will do its best to make sense out of it and display it for you.
For example, you really shouldn’t have a list item (<li>)
outside of an unordered list (<ul>), but if you put it
on your page, the browser will show you a bulleted item anyway.
When XHTML isn’t written correctly, the browser has to decide what
it should look like.
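To make the list-item example concrete, here is a small sketch of the stray item next to the properly wrapped version. The item text is made up for illustration.

```html
<!-- Not valid: the list item is not inside a list -->
<li>Buy a new tire</li>

<!-- Valid: the same item, inside an unordered list -->
<ul>
  <li>Buy a new tire</li>
</ul>
```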
Let’s say you have a flat tire. Let’s say that when you put on the spare tire, you attach it with only two of the four nuts that hold the wheel in place. After you put the hub cap back on, it will look OK, and it’s good enough to get you to the tire store to buy a new tire.
Of course, you don’t do that, because if you keep driving with the wheel held on that way, you can expect trouble in the long run. Certainly, no professional car mechanic would attach a tire that way.
Similarly, if you write “tag soup” it may look OK in one browser, but not work well in other browsers. When the next set of brand-new browsers gets released, they may make different decisions about how to handle your bad tags, so your page might look different.
That’s why you want to write valid HTML—so that you make the decisions, not the browser.
The word “valid” is just a fancy way of saying that you are following all the rules set up by the people who designed HTML. These are rules like: “if you want a list item, it had better be inside a list” or “every opening tag has to have a closing tag.” The folks at the World Wide Web Consortium have set up a web page that will validate your pages, that is, tell you whether you’re following the rules correctly.
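In order for the validator to do its job correctly, you have to tell it three things: which version of HTML you are using, the document’s main language and XML “namespace,” and the character set the document uses.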
You can tell the validator which version of HTML you are using by putting
this line in as the very first line of your file. It goes
even before the opening <html> tag.
<!DOCTYPE html>
You tell the validator what the document’s main language and XML
“namespace” are by
adding attributes to the opening <html> tag.
<html xmlns="http://www.w3.org/1999/xhtml"
xml:lang="en" lang="en">
You tell the validator what “character set” (English,
Russian, Vietnamese, etc.) your document
uses by putting the following line right after the
opening <head> tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
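Putting the three pieces together, a minimal page might look like the sketch below. The title and paragraph text are placeholders chosen for illustration; only the doctype, the <html> attributes, and the <meta> line come from the steps above.

```html
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <!-- Character set declaration goes right after the opening head tag -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <!-- Placeholder title; every page needs one -->
  <title>My First Valid Page</title>
</head>
<body>
  <p>This page tells the validator its HTML version, language, and character set.</p>
</body>
</html>
```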
Once you have your document ready, go to
http://validator.nu/
The part of the page we
are interested in lets you upload files from your computer.
Set up your screen so it matches the following screenshot.
Choose File Upload.
First, click the Browse... button to select the file you want to
validate. This will bring up the standard file chooser dialog.
Once you locate the file, click the Validate button, and the
validator will tell you if your file is valid or not.
If your page is valid, you’ll see something like this. If you have warnings, do not ignore them—read the warnings and fix the problems!
If you have a mistake in your file, like this one:
<p> This is a paragraph where we <i>turned on italics, but forgot to turn it off! </p>
You will get error messages from the validator. See the result of validating this bad file.
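For comparison, here is one way the paragraph could be repaired. The reworded ending is only for illustration.

```html
<!-- Repaired: the <i> element is closed before the paragraph ends -->
<p> This is a paragraph where we <i>turned on italics</i>, and remembered to turn it off! </p>
```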
Make sure every opening quote has a closing quote,
and every opening < has a closing >.
<img src="joe.jpg alt="Picture of Joe" /> <p Have a good one!</p>
<img src="joe.jpg" alt="Picture of Joe" /> <p>Have a good one!</p>
If you misspell a tag name or an attribute name, the validator will
say it’s wrong. (The browser will just ignore that element or
attribute.) Beware especially of using scr instead of
src in your image elements! Remember, the src
attribute tells the source file for the graphic!
<image src="joe.jpg" alt="Picture of Joe" /> <image scr="joe.jpg" alt="Picture of Joe" />
<img src="joe.jpg" alt="Picture of Joe" />
The rest of these errors are ones that the browser will overlook; it will just try to do the best it can. The validator, on the other hand, is a big meanie, and won’t let you get away with them!
XHTML is case-sensitive; all element and
attribute names must be lowercase. HTML5 doesn’t care;
you can mix upper and lowercase as much as you like. However,
to be consistent, stick with the XHTML syntax, and use
lowercase only. The letter E in Example can stay uppercase
because it is the attribute value, not the attribute name.
<OL Class="Example"> <li>item one</li> <li>item two</LI> </oL>
<ol class="Example"> <li>item one</li> <li>item two</li> </ol>
If you have nested elements (one element inside another), you must end the inner element before the outer one. Browsers do their best to display improperly nested HTML; the validator will reject any HTML document that has a nesting error.
<b>Outer and <i>inner</b> elements</i> nested incorrectly.
<b>Outer and <i>inner</i> elements</b> nested correctly.
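In XHTML syntax, every attribute value must be in quotes (single or double). An attribute may appear only once in a tag, and attributes must be separated by a space.
<a href=page2.html> <a href="page3.html" id="b" href="abc.html"> <a href="page4.html"id="c">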
<a href="page2.html"> <a href="page3.html" id='b'> <a href="page4.html" id="c">
In XHTML, any element that contains text between the opening and closing tag (like paragraphs, bold, italic, list items, etc.) has to have both tags. In HTML, many (but not all) opening tags have optional closing tags. Again, rather than have you memorize which ones are optional, always use closing tags. Then you don’t have to worry.
<p> Paragraph one <p> Paragraph two
<p> Paragraph one </p> <p> Paragraph two </p>
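The same advice applies to list items, whose closing tags are also optional in HTML syntax. The sketch below uses made-up item text to show both forms.

```html
<!-- Allowed in HTML syntax, but not in XHTML syntax: closing </li> tags left out -->
<ul>
  <li>Item one
  <li>Item two
</ul>

<!-- Preferred: every list item is explicitly closed -->
<ul>
  <li>Item one</li>
  <li>Item two</li>
</ul>
```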
What, then, are we to do with elements like <br>
and <img>, which don’t contain text?
They still need closing tags, so we can do one of
two things: we can put in a closing tag, or we can use a
“shorthand form” by placing a / before the
> of the element, as in the following examples.
<br></br> <br /> <img src="joe.jpg" alt="Picture of Joe"></img> <img src="joe.png" alt="Picture of Joe" />
You’ll note that we’ve put a blank before the slash; this keeps really old browsers from freaking out when they encounter one of these shorthand elements. In HTML syntax, you can leave out the closing slash, but to be consistent, we will use the slash as if it were XHTML syntax.
The less than sign is special—it tells the browser that you are
about to start a tag. The ampersand symbol (&) is also a special
symbol for HTML.
You can’t put a < or & directly into
the text of your document when you are using
XHTML syntax. You must instead use &lt; and
&amp;. And yes, the semicolon at the end is
required! You don’t have to write a greater than sign as
&gt;, as it never causes any ambiguity.
However, we recommend that you do so; this will keep your markup
looking symmetrical.
The HTML syntax will sometimes let you put in an ampersand
all by itself. Again, rather than having you memorize the
conditions when you can or can’t do this, always use
&amp;, which is guaranteed to work.
<p> He & I graphed the inequality x + 3 < y </p>
<p> He &amp; I graphed the inequality x + 3 &lt; y </p>
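The recommendation about the greater than sign can be sketched the same way. The sentence below is made up for illustration.

```html
<!-- Works, since a bare > in text is never ambiguous -->
<p> She &amp; I graphed the inequality y > x + 3 </p>

<!-- Recommended: written as an entity, to match &lt; -->
<p> She &amp; I graphed the inequality y &gt; x + 3 </p>
```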