Computers in Libraries
Vol. 20, No. 1 • January 2000 
• FEATURE • 
Make Your Web Site Healthier with an HTML Code Checkup 
by Ole Vind 

Web site administration includes a wide range of tasks, from configuring hardware and server software to updating content to dealing with user feedback. As Web-related technologies evolve, we are given the opportunity to create increasingly advanced Web sites that offer functionality like dynamic and personalized content. While fine-tuning and optimizing database-driven Web applications requires in-depth knowledge about programming and other technical tools, plain HTML code can easily be improved, even by non-techies.

In this article I’ll comment on some performance indicators and provide hints to assist Webmasters in the ongoing process of maintaining and optimizing HTML code. I’ll also give some examples of useful programs and services that can be found on the Internet; please refer to the sidebar for links to them.
 

HTML and Its Effects on Web Site Performance
A healthy Web site is one that’s current, fast, and reliable. It should provide an intuitive navigation interface and conform to current standards, and it should be easily available to the target audience. Such matters should, of course, be considered at the planning and design phase of any Web site, but they are just as important to keep in mind when it comes to maintaining and improving it. Regular checkups are important.

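Let us take a look at Web site performance from a user’s perspective. Performance can be viewed according to the following HTML-dependent indicators:

Speed: how quickly your pages load.
Integrity: whether your links work.
Portability: whether visitors see your site the way it was planned, whatever browser they use.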

The size and complexity of a Web site can quickly grow beyond the point where checking every page manually for performance obstacles is possible. The good news is that there are tools available to help you make your page load faster, to assist in verifying that your links are not broken, and to assure that visitors will see your site the way it was planned. You can use these tools to perform automated HTML checkups.
 

HTML Utilities That Can Diagnose Ailments
Many HTML editors include site management functionality that can check your page for syntax errors and broken links. Dreamweaver (Macromedia), FrontPage (Microsoft), and HomeSite (Allaire) are examples of popular Windows HTML editors that provide such features.

Other tools have been developed for the specific purpose of checking a Web site for common errors, and you may find them more convenient to use. NetMechanic is an example of a Web-based tool that generates a report on the current health of your Web site. (In the near future I expect to see many more such tools offering a useful service for free—or for a modest charge directly proportional to the level of use.) Direct NetMechanic to a starting URL; from there it will “crawl” your Web site and scan pages for common errors that can have an impact on performance. The output is a clearly laid out report that gives a quick overview of the Web site’s status, with suggestions for improvement.

The World Wide Web Consortium (W3C), led by the Web’s “inventor,” Tim Berners-Lee, is a vendor-neutral organization that consolidates and recommends specifications and standards for HTML. (The current recommendation from W3C is to use HTML version 4.) The W3C’s HTML Validator Service is an example of a specialized tool that focuses on your HTML code’s conformity to current standards. Let’s not forget that HTML is a non-proprietary language that’s constantly being improved to meet the demands for increased functionality.

In addition to the W3C’s recommended HTML tags, specific tags that introduce new functionality have been contributed by a number of vendors. Some of those tags and functionalities are supported only by a limited number of browsers and may or may not be incorporated into future W3C specifications. Indicators suggest that the most popular Web browsers right now are Microsoft’s Internet Explorer and Netscape Navigator/Communicator, but many more are available. (At the time of this writing, more than 30 different browsers were identified by BrowserWatch.) When you validate your code against a specific HTML version, you confirm that your code is correctly structured with all the proper tags according to that version, which makes it more likely that the widest range of browsers will interpret and display the code correctly.

Ensuring current and future portability of HTML documents is another good reason for weeding out non-W3C-recommended tags from your code. In the future, HTML editors and browsers may be a lot stricter in their interpretation of HTML code. If you use only W3C-recommended versions of HTML, odds are that your documents will be portable between a wider range of applications, both now and in the future.

If your site is on an intranet, you may not be able to take advantage of Internet-based applications; you may need special tools instead. Linkbot is an off-line utility (for Windows only) that offers extended functionality like detailed reporting on more than 50 potential problems, graphical presentations, and scheduling to allow for automated site testing on a regular basis.

Commercial vendors aren’t the only toolmakers. The HTML Shrinker is an example of a freeware utility that can decrease the physical size of your HTML files by removing all characters not strictly necessary for displaying the page in a browser (e.g., spaces and comments).
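The kind of clean-up described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the HTML Shrinker itself works; the function name shrink_html is made up for this example, and the sketch assumes the page contains no <PRE>, <TEXTAREA>, or <SCRIPT> sections, where spacing or comments may be significant.

```python
import re


def shrink_html(html):
    """Hypothetical helper: strip comments and collapse runs of whitespace.

    Assumes no <PRE>, <TEXTAREA>, or <SCRIPT> content, where whitespace
    or comment markers can matter.
    """
    # Remove <!-- ... --> comments, including ones that span several lines.
    # A <!DOCTYPE ...> declaration is not a comment and is left alone.
    without_comments = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Browsers render any run of spaces, tabs, and newlines as one space,
    # so collapsing each run to a single space does not change the display.
    collapsed = re.sub(r"\s+", " ", without_comments)
    return collapsed.strip()


if __name__ == "__main__":
    page = "<HTML>\n  <!-- navigation -->\n  <BODY>\n    <P>Hello   world</P>\n  </BODY>\n</HTML>\n"
    small = shrink_html(page)
    print(len(page), "->", len(small), "characters")
```

Keeping a single space between tags, rather than removing it entirely, is a deliberately cautious choice: a space between two inline elements can be visible on the page.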
 

Some Common Ailments That You Can Cure Easily
Following are some examples of areas where Web site performance typically can be improved:

Speed: Optimizing the Use of Images. Size does matter! On an intranet, you may have the luxury of unlimited space and bandwidth, but for any public Internet site you must consider the large number of people who use a modem connection. Even if you have no immediate reason for being concerned about network bandwidth, it is good practice to optimize your Web site for best performance at all times. If you are doing a good job as Webmaster, chances are that traffic to your site will be increasing in the future anyway, so be prepared.
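As a rough illustration of why size matters on a modem connection, the arithmetic can be sketched in Python. The function name download_seconds, the 36,000-byte page, and the line speeds are assumptions chosen for the example; the estimate ignores latency, protocol overhead, and modem compression, so real transfers will differ.

```python
def download_seconds(size_bytes, link_bits_per_second):
    """Hypothetical helper: idealized transfer time for a file.

    Ignores latency, protocol overhead, and compression, so treat the
    result as a rough lower-bound style estimate only.
    """
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    page_bytes = 36000  # assumed total of HTML plus images
    for speed in (28800, 56000):  # assumed modem line speeds in bit/s
        print(speed, "bit/s:", round(download_seconds(page_bytes, speed), 1), "seconds")
```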
Tools to Make Your Web Site Work Better
The tools and services mentioned here are examples of what currently can be found on the Internet. Use your favorite search tool or Internet directory to scan the Internet for similar resources. 

Bobby: 
http://www.cast.org/bobby
A free Web-based tool that analyzes Web pages for their accessibility to people with disabilities. Also available for download. 

HTML Shrinker: 
http://pico.i-us.com
Off-line tool for compressing HTML pages by removing unnecessary elements like comments and spaces. Freeware for Windows. 

HTML Validator: 
http://www.htmlvalidator.com
HTML Validator is for Windows 95/98/NT. A 50-validation trial version is available. 

Linkbot: 
http://www.tetranetsoftware.com/products/linkbot.htm
Off-line tool for finding errors and areas to improve on your Web site (Windows 95/98/NT). A 15-day trial version is available. 

NetMechanic: 
http://www.netmechanic.com
A Web-based tool for checking your Web site for common errors. 

W3C HTML Validation Service: 
http://validator.w3.org
A Web-based tool for validating your HTML code. 

Webmonkey Browser Chart: 
http://www.hotwired.com/webmonkey/reference/browser_chart
A chart that tells which browsers support which features. 

Web Site Garage: 
http://www.websitegarage.com
A Web-based tool for checking your Web site for common errors. 

Make sure to size all images correctly and to specify the size in the <IMG> tag, using the “height” and “width” attributes, so that the browser can lay out the entire page without waiting for the images to download. While this doesn’t change the overall load time for the page, the user will see text and links sooner, and as a result the perceived wait time is shorter. (If you merely specify an image size smaller than the actual size, the browser still has to download the full-sized image and scale it down for display, so no load time is saved.) Always remember to include alternative text in the ALT attribute (for example, ALT="image name"). That text will show up instead of the image if the browser cannot display images, or until the image has been downloaded, giving the user an inkling of what is to come.
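A check for missing image attributes is easy to automate. The following minimal sketch uses the HTML parser in Python’s standard library; the class name ImgAudit and the function audit_images are invented for this example, and the sketch only reports whether the attributes are present, not whether their values match the real image size.

```python
from html.parser import HTMLParser


class ImgAudit(HTMLParser):
    """Hypothetical checker: collect <IMG> tags lacking width, height, or alt."""

    REQUIRED = ("width", "height", "alt")

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (src, [missing attribute names])

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <IMG SRC=...>
        # and <img src=...> are treated alike.
        if tag != "img":
            return
        present = {name for name, _ in attrs}
        missing = [name for name in self.REQUIRED if name not in present]
        if missing:
            self.problems.append((dict(attrs).get("src", "?"), missing))


def audit_images(html):
    """Return a list of (src, missing_attributes) for incomplete <IMG> tags."""
    parser = ImgAudit()
    parser.feed(html)
    parser.close()
    return parser.problems


if __name__ == "__main__":
    sample = ('<IMG SRC="logo.gif" WIDTH="120" HEIGHT="60" ALT="Library logo">'
              '<IMG SRC="map.gif" WIDTH="300">')
    for src, missing in audit_images(sample):
        print(src, "is missing:", ", ".join(missing))
```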

If a page contains several images, consider using thumbnails. A thumbnail is a small, low-resolution version of an image that links to the full-sized version. It gives users a “sneak preview” to help them decide whether to download the full-sized image.

No image shows up in a better resolution than 72 dpi on a typical computer monitor. So unless your images are meant to be downloaded and printed in high resolution, you can safely reduce the resolution to 72 dpi using an image editor (e.g., Paint Shop Pro from JASC Software); the saving comes from resampling the image to fewer pixels, so make sure the editor actually resizes it. Reducing the number of colors used in images can also help speed up the download time, and it is good practice to aim at using only the 216 colors in the “Web-safe” palette. The so-called Web-safe palette is a set of colors that display consistently, without dithering, on workstations limited to 256 colors.
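The 216 colors of the Web-safe palette are the combinations of six evenly spaced levels (0, 51, 102, 153, 204, and 255) for each of the red, green, and blue channels. A small Python sketch can snap an arbitrary color to its nearest Web-safe neighbor; the names WEB_SAFE_LEVELS and nearest_web_safe are made up for this illustration, and an image editor would normally do this work for you.

```python
# The six channel levels of the Web-safe palette (hex 00, 33, 66, 99, CC, FF).
WEB_SAFE_LEVELS = (0, 51, 102, 153, 204, 255)


def nearest_web_safe(rgb):
    """Hypothetical helper: round each 0-255 channel to the nearest level."""
    # The levels are multiples of 51, so rounding channel / 51 picks the
    # closest one. An integer channel can never fall exactly halfway.
    return tuple(51 * round(channel / 51) for channel in rgb)


if __name__ == "__main__":
    palette_size = len(WEB_SAFE_LEVELS) ** 3  # 6 levels per channel, 3 channels
    print("Palette size:", palette_size)
    print(nearest_web_safe((200, 100, 30)))
```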

Integrity: Verifying Your Links. Everybody knows how frustrating it is to click on a dead link. (Of course there are no broken links on the Web site you are taking care of!) As your Web site grows and its structure gets more convoluted, you may find it a demanding task just to make sure all navigation links in your site are working. Choosing a suitable tool to help you locate broken links will greatly ease your job as a Webmaster. Of course you will constantly monitor all links on your Web site, but do give users an easy means of reporting problems (like e-mail to the Webmaster).
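For a sense of what a link-checking tool does behind the scenes, here is a minimal Python sketch using only the standard library. It is not how any of the tools named in this article work; the names LinkCollector, is_reachable, and find_broken_links are invented for the example, it checks a single page rather than crawling a site, and it assumes that a server answering a HEAD request with an error means the link is broken.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Hypothetical helper: gather absolute http(s) URLs from <A> and <IMG>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a":
            target = attributes.get("href")
        elif tag == "img":
            target = attributes.get("src")
        else:
            target = None
        if not target:
            return
        absolute = urljoin(self.base_url, target)  # resolve relative links
        # Skip mailto:, ftp:, and similar schemes that this sketch cannot test.
        if urlparse(absolute).scheme in ("http", "https"):
            self.links.append(absolute)


def is_reachable(url, timeout=10):
    """Send a HEAD request; treat errors and timeouts as unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):  # URLError and HTTPError are OSError subclasses
        return False


def find_broken_links(html, base_url, check=is_reachable):
    """Return the links on one page for which check(url) is false."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return [url for url in collector.links if not check(url)]
```

The check parameter lets you substitute your own test (for example, one that consults a list of known pages on an intranet) without touching the network.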

Portability: Validating Your HTML Code. By including the following declaration on the first line of the HTML document, you state which version of HTML the document is written in:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">

The "-//W3C//DTD HTML 4.0//EN" portion specifies that this document is designed to adhere to the W3C HTML version 4. Use this declaration to let the HTML validator know which HTML version you want to validate against. You settle on an HTML version by determining the level of functionality you require and by considering the audience you want to attract to your Web site (i.e., which types and versions of browsers you want your Web site to support). Consult the Webmonkey Browser Chart to find out which browser supports which features.
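Before submitting pages to a validator, it can be handy to see which HTML version each page declares. The following Python sketch reads the public identifier from a document type declaration; the names DOCTYPE_RE and declared_public_id are made up for this example, and the sketch assumes the declaration appears at the very start of the file and uses plain (straight) double quotes, as HTML requires.

```python
import re

# Matches a declaration such as:
#   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
# and captures the text between the straight double quotes.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE)


def declared_public_id(html):
    """Hypothetical helper: return the declared public identifier, or None."""
    match = DOCTYPE_RE.match(html.lstrip())  # allow leading blank lines
    return match.group(1) if match else None


if __name__ == "__main__":
    page = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">\n<HTML></HTML>'
    print(declared_public_id(page))
```

A page for which the function returns None either has no declaration or has one the sketch does not recognize, so it is worth a closer look by hand.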
 

Maintaining Optimum Health
Keeping a Web site current and “healthy” is an ongoing process. No matter how much time you dedicate to maintenance and development, there always seems to be another part of the Web site that could be improved. Many factors will have an impact on the overall performance, some of them beyond the Webmaster’s control (such as network and hardware performance). However, fine-tuning your HTML code with regard to speed, integrity, and portability is likely to result in an instant performance improvement. And as with our own health, regular checkups can catch problems before they get too bad.

Finding and using the proper diagnostic tools will increase your productivity and will ease your job as Webmaster. Even if you don’t have a budget for expensive commercial software solutions, you can still find resources on the Internet that will help you optimize your Web site performance.
 
 

Ole Vind holds an M.I.L.S.-equivalent degree from The Royal School of Library and Information Science in Copenhagen, Denmark. He is working as an IT professional designing and supporting Web sites as a consultant with CGI in Toronto, Canada. His e-mail address is ole.vind@cgi.ca.
 
 

References

BrowserWatch: http://browserwatch.internet.com/browsers.html

Powell, Thomas A., and Dan Whitworth. HTML Programmer’s Reference. Berkeley, Calif.: Osborne/McGraw-Hill, 1998.

StatMarket: http://www.statmarket.com/SM?c=Browsers

