Well, I've just had to convert another HTML 4.01 Transitional website over to XHTML (eXtensible HyperText Markup Language) 1.0 Transitional. I will convert it to XHTML 1.0 Strict later on, as soon as I have written the CSS (Cascading Style Sheets) file for it, but this had to be a quick update, and moving all the presentational markup into a CSS file is not quicker than just altering some tags. That is where I got the idea to write a how-to on converting HTML over to XHTML.
Let's begin
I'm having trouble knowing where to start, so let's write ourselves an HTML 4.01 Transitional page containing most of the things that need altering (mistakes made on purpose). Then I can explain how to fix it up so it becomes a valid XHTML 1.0 Transitional page.
CODE
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<head>
<title>HTML Basic</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<base href="http://localhost">
<link rel="stylesheet" type="text/css" href="/style.css">
</head>
<body bgcolor="#FFFFFF" link="#003399">
<hr>
<form method="post" action="/cgi-bin/script.cgi" name="Script">
<table border=0 cellspacing=0 cellpadding=2 align="center" width=
"95%" summary="Script Entry Form" height="44">
<tr>
<td height="44" colspan="2"><img src="script.png"
width="142" height="28"></td>
</tr>
<tr>
<td width="66%">Fill in the script entries. After you submit your
entry, you will be returned. <br> The * mean required fields.</td>
<td width="34%" align="right"> </td>
</tr>
<tr>
<td colspan="2">
<table border="0" cellspacing="1" cellpadding="4" width="95%"
summary="Time for Input">
<tr>
<td bgcolor="#EFEFEF" width="179">Input* :</td>
<td bgcolor="#EFEFEF" width="460"><input type="text" name=
"inputtext" size="40" maxlength="40"></td>
</tr>
</table>
</td>
</tr>
</table>
</form>
</body>
</HTML>
So here's our base HTML file. How do we go about altering it so that it's a valid XHTML 1.0 Transitional file? Let's start from the top. The first line, which W3C recommends, should be the XML declaration (the line starting with <?xml), so let's add that.
CODE
<?xml version="1.0" encoding="iso-8859-1"?>
Next we have to update our Document Type to reflect XHTML 1.0 Transitional. That means altering it to:
CODE
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
This is the order they should go in: in HTML the doctype came first, but in XHTML the XML declaration is first, then the doctype.
Next we alter the <HTML> tag. XHTML is HTML reformulated as XML, so its rules reflect the combination of both. XML requires exactly one root element per document. In XHTML that root element is <html>, so every other tag must be enclosed within it.
Now to alter <HTML>
CODE
<html xmlns="http://www.w3.org/1999/xhtml" lang="EN">
xmlns is the attribute for the XML namespace; it is how we mark the document as XHTML. The lang attribute gives the language of the document.
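Putting the three pieces so far together (XML declaration, doctype, root element), a bare-bones page might look like the sketch below. The title and paragraph text are placeholders of mine, and I have also added xml:lang next to lang, which the XHTML 1.0 compatibility guidelines suggest but which is not in the page we are converting.

```html
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<!-- xmlns marks the document as XHTML; lang and xml:lang both give the language -->
<head>
<title>Skeleton page</title>
</head>
<body>
<!-- Everything else lives inside the single root element -->
<p>Content goes here.</p>
</body>
</html>
```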
All XHTML element and attribute names must be in lowercase, because XML is case-sensitive. That means <HTML> should be rewritten as <html>, and likewise the end tag </HTML> should be </html>, so fix that now.
Next we fix up the empty tags. In XHTML, empty tags must be closed with />
The empty tags in our HTML that need fixing up are:
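As a quick illustration of the lowercase rule, here is a made-up one-cell table (not taken from our page) before and after. Fixed-choice values such as right have to be lowercase as well, while free text such as the summary keeps whatever case you gave it.

```html
<!-- Before: fine in HTML, where names are not case-sensitive -->
<TABLE BORDER="0" SUMMARY="Example Table"><TR><TD ALIGN="RIGHT">Text</TD></TR></TABLE>

<!-- After: element names, attribute names and fixed-choice values in lowercase -->
<table border="0" summary="Example Table"><tr><td align="right">Text</td></tr></table>
```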
<meta> tags to <meta />
<base> tag to <base />
<link> tags to <link />
<hr> tags to <hr />
<img> tags to <img />
<br> tags to <br />
<input> tags to <input />
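To make the list above concrete, here are three of the empty tags from our page, before and after. The space before the slash is not required by XML; it is the form the XHTML 1.0 compatibility guidelines suggest so that older HTML browsers cope with it.

```html
<!-- Before (HTML 4.01): empty elements have no end tag -->
<hr>
<br>
<base href="http://localhost">

<!-- After (XHTML 1.0): each empty element is closed with " />" -->
<hr />
<br />
<base href="http://localhost" />
```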
We would be finished at this point, but I left in some other problems.
Here are some more rules:
XHTML attribute names must be in lowercase too, and all attribute values must be inside quote marks. In the first <table> tag I left out the quote marks for some of the values. This is still valid HTML but not valid XHTML, which requires every value to be quoted, so do that now.
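Here is the opening <table> tag from our page cut down to the attributes that matter for this rule, before and after the quotes go in (the table's contents are left out):

```html
<!-- Before: unquoted values, which HTML 4.01 tolerates -->
<table border=0 cellspacing=0 cellpadding=2 align="center" width="95%">

<!-- After: every value quoted, as XHTML requires -->
<table border="0" cellspacing="0" cellpadding="2" align="center" width="95%">
```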
Another problem is also in the first <table> tag: XHTML does not support the height attribute on <table> (strictly speaking, HTML 4.01 does not define it there either), so we cut it from there and keep it on the first <td> tag below it, since <td> does support the height attribute.
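A simplified sketch of the move, using a one-cell table with placeholder text of mine rather than the full table from our page:

```html
<!-- Before: height sits on <table>, where the XHTML DTD does not allow it -->
<table width="95%" summary="Script Entry Form" height="44">
<tr>
<td colspan="2">Logo row</td>
</tr>
</table>

<!-- After: the same height carried by the first cell instead -->
<table width="95%" summary="Script Entry Form">
<tr>
<td height="44" colspan="2">Logo row</td>
</tr>
</table>
```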
One more problem: the alt attribute is mandatory (absolutely needed) on every image on our site. It helps those who use text-based browsers, or assistive browsers such as screen readers, to enjoy our sites too. So now fix our img tag to have an alt attribute; I will be using Script Logo. Another thing, which is not an error but should be considered, is giving every table a summary, again to help others enjoy our site better. It's much appreciated if we do this.
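For the image in our page the fix looks like the first two tags below. The third tag is a hypothetical decorative image (spacer.png is a file name I made up) to show that alt must still be present there, but may be left empty.

```html
<!-- Before: no alt attribute, so the validator rejects the tag -->
<img src="script.png" width="142" height="28" />

<!-- After: alt describes the image for text-based browsers and screen readers -->
<img src="script.png" alt="Script Logo" width="142" height="28" />

<!-- Hypothetical decorative image: alt is still required, but can be empty -->
<img src="spacer.png" alt="" width="1" height="1" />
```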
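Last but not least, the name attribute is being deprecated (slowly phased out) on elements such as <form>, <a> and <img>, and its replacement is the id attribute. On those elements we replace name with id, which in our page means the <form> tag. Be careful with a blanket search and replace of "name" with "id", though: form controls such as <input> are sent to the server under their name, and name is not deprecated there, so keep it on the <input> and add a matching id next to it. That was our last step.
Now our page should be fixed up, and what we have should look like this: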
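One caution worth sketching here: a form field is submitted to the script under its name, so dropping name from an <input> means its value never arrives. The snippet below is my suggested handling for the form in our page, not something the validator demands: the <form> tag swaps name for id, while the text field keeps name and gains a matching id.

```html
<!-- The form itself: name replaced by id -->
<form method="post" action="/cgi-bin/script.cgi" id="Script">
<!-- The text field: name kept so "inputtext" is still sent to the script, id added alongside -->
<input type="text" name="inputtext" id="inputtext" size="40" maxlength="40" />
</form>
```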
CODE
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="EN">
<head>
<title>HTML Converted to XHTML</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<base href="http://localhost" />
<link rel="stylesheet" type="text/css" href="/style.css" />
</head>
<body bgcolor="#FFFFFF" link="#003399">
<hr />
<form method="post" action="/cgi-bin/script.cgi" id="Script">
<table border="0" cellspacing="0" cellpadding="2" align="center" width=
"95%" summary="Script Entry Form">
<tr>
<td height="44" colspan="2"><img src="script.png" alt=
"Script Logo" width="142" height="28" /></td>
</tr>
<tr>
<td width="66%">Fill in the script entries. After you submit your
entry, you will be returned. <br /> The * mean required fields.</td>
<td width="34%" align="right"> </td>
</tr>
<tr>
<td colspan="2">
<table border="0" cellspacing="1" cellpadding="4" width="95%"
summary="Time for Input">
<tr>
<td bgcolor="#EFEFEF" width="179">Input* :</td>
<td bgcolor="#EFEFEF" width="460"><input type="text" name=
"inputtext" id="inputtext" size="40" maxlength="40" /></td>
</tr>
</table>
</td>
</tr>
</table>
</form>
</body>
</html>
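The XHTML page should validate as XHTML 1.0 Transitional using the validator over at W3C; you can check it for yourself. Don't expect the original HTML page to pass as it stands, though: the missing alt attribute and the height on <table> are errors in HTML 4.01 Transitional as well.
Well, that's it from me. Just remember the key rules and you'll have no problem with the transition.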
Cheers, MC


