Web Resources for Web Authors
revised 1 Apr 2006
Copyright © 1997–2008 by Stan Brown, Oak Road Systems
Summary: A Web author needs to know HTML and CSS, but it can be hard to separate information from misinformation on the Web. On this page you’ll find helpful links for all phases of creating your Web site using HTML and CSS.
Copying: You’re welcome to print copies of this page for your own use, and to link from your own Web pages to this page. But please don’t make any electronic copies and publish them on your Web page or elsewhere.
There are many fine sites that discuss HTML and Web authoring, far too many to list in any one Web page. What you'll see here are my personal favorites. If you know of a site that does a better job on any particular topic, please let me know.
There are a lot of links in this document. This symbol
marks the ones that I think are absolutely essential for your
tool kit.
“Best viewed in” Browser N or Browser M, if it means anything, means that the
site designer has built in features that are proprietary to a particular
browser, quite likely with security
risks in the shape of
ActiveX and
JavaScript.
It puts the designer’s idea of “cool” above the safety and convenience
of the visitor, and I think that’s exactly backwards.
Would you tell people which brand of TV to use to watch a program, or which brand of DVD player to use to watch a movie? Of course not! When you create content, you follow published standards so that people can use any available device to view it. Why should things be different on the Web?
I say (as every author should): use your favorite browser! No author should have the chutzpah to tell you this site is “best viewed in” their favorite. The goal is to write standard HTML and CSS that should work in all browsers, or at least degrade gracefully — even text-only browsers like Lynx. An extra benefit is that such HTML is usually shorter and simpler than HTML carefully tailored to match a particular browser or collection of browsers.
This page recommends a number of informative sites that share this philosophy.
This section gives you some idea where to begin when you just want to learn HTML and CSS. Don’t be afraid! Though there’s a lot to HTML, an afternoon or evening should be plenty to learn all the HTML you need to get started creating some basic Web pages. Then with another afternoon or evening you can get your feet wet styling your pages with CSS. As you get more comfortable, you’ll start to use some of the reference materials, and refer to them frequently for the syntax and meaning of tags, attributes, and properties.
By the way, you may have heard about XHTML. The X stands for “Extensible”, and XHTML is a reformulation of HTML as an XML application. If that sounds like pure gobbledygook, don’t worry! Though it’s the wave of the future, and the New York Public Library style guide calls for XHTML, few browsers support special XHTML features at this writing (September 2003). For now, my advice is don’t worry about XHTML. Your HTML documents won’t become obsolete for years and years, if ever.
If you do want to write your HTML in a way that will easily adapt to XHTML in the future, your best bet is to write strict HTML4 and validate it frequently. Then it won’t take much to turn your valid HTML4 into valid XHTML. For details, see the Compatibility Guidelines in the XHTML spec.
You can get by with plain vanilla HTML, but most sites these days are styled with colors, borders, and interesting layout, and you probably want that for your pages too. Please resist the temptation to specify fonts, colors, and layout in your HTML! Every aspect of presentation on screen or page is much better done with a style sheet written in CSS ( = Cascading Style Sheets). For a consistent look and feel, create a separate style sheet and link your HTML documents to it.
Here are some general tutorials for beginners:
I recommend you look at all three of those. While there is naturally some overlap, each one covers some unique points.
Once you’re past the beginner level, I can recommend these intermediate-level tutorials:
The Web Design Group maintains a wealth of material, including the absolutely essential Web Authoring FAQ. While that FAQ has not been maintained for a while, it still answers the questions asked by pretty much every new Web author.
Brian Wilson offers an excellent Cascading Style Sheets FAQ, pitched at about the beginner level.
The alt.html FAQ covers both HTML and CSS issues, and it has been updated more recently than the WDG’s Web Authoring FAQ mentioned above.
It’s a sad fact of life that a Web author has to put a significant amount of effort into getting around browser quirks and bugs. And in a vicious cycle, since a lot of authors wrote pages that relied on known bugs, newer browser versions deliberately emulate those old bugs. As an author you can defend yourself to some extent by writing only valid HTML and CSS, but you need some additional resources:
Several Usenet newsgroups are full of experts who, amazingly enough, will answer your questions for free. But the Net is like God: it helps those who help themselves. Please, before posting a question, check the appropriate FAQs and reference materials; whittle down your problem page to the smallest one that still displays the puzzling behavior; and make sure you’ve validated both your HTML and your CSS. Then pick the right newsgroup:
There are also, as of this writing (September 2003), seven HTML newsgroups in the alt.html hierarchy, namely alt.html itself and six newsgroups alt.html.something. If you don’t get an answer in the c.i.w.a.* groups, you may want to try one of the alt.html groups.
Before you spend too much time building your site, or when you decide it’s time for an overhaul, you should give some thought to some Web design guidelines. This section points you toward some of the best.
A Message to Clueless Website Authors is a polemic well worth reading. Avoid the annoyances mentioned there and your site will be better than most.
Jukka Korpela has written a number of thought-provoking articles on authoring for the Web.
Authors tend to spend a lot of time trying to lay out every pixel
of their site. Some even design “for” a particular screen resolution.
Paradoxically, they spend
lots of extra effort to make their pages worse on everybody’s
screen except their own. A better approach is what some call
liquid design, which adapts automatically to different screen
sizes or window sizes.
Do yourself a favor and read Web Pages Aren’t Printed on Paper by John Allsopp. That should start you thinking about all the advantages of a liquid design, not the least of which is that it saves you work. Then read Jukka Korpela’s Publishing on the Web Is Different.
While you’re at Jukka’s site, check out his Links Want To Be Links, which talks about non-underlined links, borderless image links, drop-down links, and more.
Make sure that your HTML is viewable in every browser. Remember that not everyone uses Netscape or MSIE, and not everyone’s screen is the same size as yours. If you must use browser-specific tags, make sure that your page still presents its content adequately if the visitor is using a different browser, or just has those features turned off.
Also remember that not everyone “viewing” your pages has normal vision. People may be color blind, unable to read small type, or even completely blind and “viewing” your page via a reader. Ask yourself whether your pages make sense when read aloud, and if all the navigation works for people who can’t see pictures and don’t have JavaScript. (Search engines can’t see pictures and don’t have JavaScript, so following these guidelines will also make your pages easier to find.)
Jakob Nielsen’s Alertbox explores issues of usability that concern authors. See especially his Top Ten New Mistakes of Web Design.
The W3C has published a set of Web Accessibility Guidelines. And Usable Web is a truly massive collection of links to pages covering usability issues.
An automated process, CAST’s Bobby, will test your Web page against most of those guidelines. There’s also a version you can download to test your entire site, or files on your computer, in one go.
Even if you’re writing in English, you’ll need some characters that aren’t on your keyboard. At this point a little terminology is in order.
The Web standard character set is Unicode. Its original 16-bit design had room for 65,536 different characters; the current standard extends well beyond that, covering letters in many languages as well as special symbols.
Unicode character numbers 32–126 are the same as US-ASCII, which is basically the characters on US keyboards.
US-ASCII plus Unicode character numbers 160–255 makes up the ISO-8859-1 or Latin-1 character set. Pretty much any browser can cope with these characters, which are listed here.
The MS Windows 1252 character set, Code Page 1252, is the same as ISO-8859-1, except that Windows also defines characters like dashes and curly quotes in positions 128–159. Those Windows characters will look like something else on the Web, when viewed by some Windows users and most non-Windows users. To avoid this, follow the advice in Henry Churchyard’s blessedly short document, The Correct Way to Display “Smart Quotes”, the Trademark Symbol, etc. It’s a good fast reference, but necessarily oversimplified.
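To see why those Windows characters go wrong, here is a small Python sketch (my illustration, not taken from Churchyard's document). It decodes the same two bytes both ways, then rewrites the curly quotes as numeric character references. The helper name to_numeric_refs is made up for this example.

```python
# Bytes 0x93 and 0x94 are curly double quotes in Windows-1252 only.
raw = b"\x93smart quotes\x94"

# Read as Windows-1252, the bytes become U+201C and U+201D.
as_cp1252 = raw.decode("cp1252")

# Read as ISO-8859-1, the same bytes land in the 128-159 range,
# which holds control codes rather than printable characters.
as_latin1 = raw.decode("latin-1")


def to_numeric_refs(text):
    """Replace every non-ASCII character with an &#number; reference.

    Hypothetical helper for illustration; the numbers are Unicode
    code points, so they do not depend on the page's declared charset.
    """
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


safe_html = to_numeric_refs(as_cp1252)
print(safe_html)  # &#8220;smart quotes&#8221;
```

The numeric form is pure US-ASCII, so it survives no matter which of the two character sets the visitor's system assumes.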
In addition to encoding your non-ASCII characters, you
must also be concerned with declaring your document’s character set.
See the oft-cited
Checklist
for HTML Character Coding by Alan Flavell. That page lays out for you exactly how
to declare your character set depending on just what you’re trying to
accomplish.
Please see Unicode, Character Entities, and Numbers below for full information.
Very often, people will see something they like on a Web page and wonder how to get a similar effect on their own page. One easy answer is to look at the source code of the page that you admire. Depending on your browser, you can right-click and select “View Source”, or look for a “View Page Source” selection in the top-of-screen menus. Lynx users would press the “\” (backslash) key.
Contrary to what some people tell you, there is no way to hide the source code. Your browser has to get hold of it to display the page, and once your browser has it it’s on your computer. If you can’t find a way to view the source code live, you can always save the page to your own computer, and then open it using your favorite plain-text editor.
It’s a fine line between learning from someone else and plagiarism or copyright violation. While most people would agree that it’s okay to learn and use a particular trick from someone else’s page, it’s certainly not okay to just lift the entire page, change a few things, and call it yours. And if you do learn something useful from a particular page, it’s a gracious gesture to give the author credit, either by a link on your own page or by an e-mail of thanks.
Here are links to help with two common tasks.
When you want to present large images, visitors’
computers will display your page faster if you don’t have them right
on the page, but instead have thumbnails that link to the big
pictures. For instance, if the actual picture canvas.jpg
is 800×600 pixels, you might create a thumbnail
thcanvas.jpg at 200×150 pixels, which will load
about 16 times faster. (Don’t simply shrink the original image; crop
it to just the area of primary interest first, and shrink
that.)
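The “16 times” figure is just a ratio of pixel counts, which this Python sketch works through. Treat it as a rough guide: real download time also depends on image content and compression, and the function names here are my own invention.

```python
def pixel_ratio(full, thumb):
    """How many times more pixels the full image has than the thumbnail."""
    full_w, full_h = full
    thumb_w, thumb_h = thumb
    return (full_w * full_h) / (thumb_w * thumb_h)


def thumbnail_size(full, max_width):
    """Scale (width, height) down to max_width, keeping the aspect ratio."""
    full_w, full_h = full
    scale = max_width / full_w
    return (max_width, round(full_h * scale))


# The example from the text: canvas.jpg at 800x600, thcanvas.jpg at 200x150.
print(pixel_ratio((800, 600), (200, 150)))  # 16.0
print(thumbnail_size((800, 600), 200))      # (200, 150)
```

The second function gives you the width and height values to put in the thumbnail's img tag.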
For linking to an image from a thumbnail, the HTML code is pretty straightforward:
<a href="canvas.jpg"><img src="thcanvas.jpg"
width=200 height=150
alt="painting: Mona Lisa del Mar"></a>
To create the thumbnails, you can use any of the commercial image editors, but in my opinion you can’t beat free.
One of my favorites, for users of MS-Windows 3.x/9x/NT, etc., is the free LView Pro 16. (Its successor, LView Pro 32, knows about long filenames but is not free.) LView Pro can easily create thumbnails of individual images, or contact sheets of multiple images. You can download LView Pro 16 here or here or here. If those sites don’t work, try your favorite search engine and look for “lviewp1b.zip”. (Be prepared to weed through a lot of non-working links in the search results.)
Straight HTML is not very good for presenting math expressions, but you can probably do better than you think. Jukka Korpela has a lot of good ideas for you at Math in HTML (and CSS).
Where you really need two-dimensional math expressions, there are a couple of possibilities for creating the images:
When you eventually publish your site, you’ll want to have it listed on the major search engines so that people looking for your topics will find it. As you write your pages, you want to think ahead and do things right up front.
Three mistakes are very common. Don’t fall into these traps:
The best advice to an author is to let your content speak for itself. Search engines index the text of a page: if your text is clear and uses the terms people commonly use for your topic, your page will come up in people’s searches — and probably higher than you think.
There are some specific things you can do:
Search Engine Watch has lots of good suggestions and explains the reasons behind them. The article Search Engine Placement Tips and its sequel How to Use HTML Meta Tags are particularly helpful.
There’s a big market out there for “Web design” or “Web authoring” programs. A lot of trusting people are using Microsoft Office to create a document and then “save it as HTML”. I don’t believe in these programs. Though I haven’t used them (except Office), I have seen the results, and they can be pretty bad.
Why do I not recommend such programs? Simply because they don’t deliver on their promise of freeing you from learning HTML and CSS.
HTML and CSS are just not that hard, and you don’t even have to shell out money for a book. There are lots of good free tutorials on the Web, and you can learn by example as well.
Many of the Web programs emit invalid HTML, so some browsers won’t even display your pages. The programs also tend to emit hugely bloated HTML: empty <i></i>, lots and lots of &nbsp;, tables nested six deep, and so on.
Either way you end up having
to edit HTML anyway. Which is a better use of your time, to learn
HTML/CSS plus some proprietary program, or to learn just HTML/CSS?
Finally, the programs sell you the false idea that you can use HTML to control layout down to the last pixel. That’s simply not true, and the more work you spend trying to make it true the worse the results will be. We’re all used to spending time tweaking margins and fonts in our favorite word processor, so it’s natural to expect the same performance from the Web. But in fact the Web operates very differently: the author gives presentation hints in the form of CSS. The user may accept those or not, or substitute a different set of presentation hints in the form of a user style sheet. This is good for the users because they can set up style sheets with comfortable font sizes and colors; it’s good for the author because you can do a lot less work.
So after all that, what do I recommend? Pick your favorite plain-text editor (not a word processor) and use it to create raw HTML and CSS. If you don’t know how to get a particular effect, find an example and use it; if you can’t find an example then probably the thing can’t be done. Just relax and accept the fact that different browsers will display your content somewhat differently; the good news is that you can spend your time on creating content and not worry about tweaking presentation.
Which editor? It’s a very personal choice. If you don’t do much editing, use the editor that came with your operating system (e.g., Windows Notepad). Once you’ve done your first page, you can create subsequent pages by copying it and editing the copy.
If you do a lot of editing, or just want things like the ability to map keys and store macros, you probably want a better editor, such as the free editor Vim, which is available for virtually every platform. Vim also highlights HTML and CSS syntax, helping you spot coding errors. Vim is well supported in the newsgroup comp.editors (put “Vim” in your subject line).
If you have many pages, say more than a dozen or so, you may begin to find it a burden to keep them all up to date, especially if you want them all to have a similar “look”. For the styling, of course, you just create a single CSS style sheet and link every HTML document to it. But consistent text (wording of menu selections, for instance) can be problematic.
The way I solve this problem is to maintain in each file only the content that is unique, and then use the text-processing program AWK to put in all the boilerplate like the navigation bar and copyright notice. AWK is terrific for repetitive tasks of processing text, and GNU AWK is an excellent free version. Simtel no longer hosts the executables and DOC files for the GNU project; do a Web search for gawk304x.zip.
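If you don't know AWK, the same idea can be sketched in Python. This is not the AWK script used for this site, only a minimal illustration of the technique: keep the unique content separate and wrap it in shared boilerplate. The file name site.css, the function build_page, and the sample text are all hypothetical.

```python
from string import Template

# Shared boilerplate for every page; only the $placeholders vary.
PAGE = Template("""<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>$title</title>
<link rel="stylesheet" type="text/css" href="site.css">
</head>
<body>
$navbar
$content
<p class="footer">$copyright</p>
</body>
</html>
""")


def build_page(title, content, navbar, copyright_notice):
    """Wrap one page's unique content in the site-wide boilerplate."""
    return PAGE.substitute(
        title=title,
        navbar=navbar,
        content=content,
        copyright=copyright_notice,
    )


# Hypothetical sample values, just to show the call.
page = build_page(
    title="Sample Page",
    content="<p>The unique text of this page.</p>",
    navbar='<p><a href="index.htm">Home</a></p>',
    copyright_notice="Copyright notice goes here",
)
print(page)
```

Change the navigation bar or the copyright notice in one place, rerun the script over your content files, and every page picks up the new wording.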
Whether you’re a new Web author or have been around a while, you really need to validate your HTML and CSS. Most of the time when a page doesn’t look the way you expect, or works in one browser but not another, the problem is invalid coding. Validating should be a routine step, to shake out as many problems as possible.
The
W3C Validation Service will
validate the HTML of
a page on the Web or a file on your computer.
I have passed every page on this site through this validator,
which I recommend heartily for its ease of use. Once your page passes
for the first time, you can include a link on it, and then just access
the link to re-validate any updated version of your page.
(Try this with the image at the right!)
The Web Design Group’s
HTML validator
uses the same engine as W3C, but a slightly newer version. The great
benefit of the WDG’s validator is that it often produces more friendly
error messages, with links to more extensive help.
The Web Design Group also maintains a list of HTML validators.
The W3C’s CSS
Validation Service will check your CSS, either as a
publicly accessible file or by upload from your computer.
You can also check little snippets of CSS in isolation.
Once you have valid HTML and CSS, they stay valid. But valid links break because authors move pages, change ISPs, or just remove pages from the Web. Fight “link-rot” with these link checkers:
Not validators as such, two related programs are
You can get your own copy of the NSGMLS engine
that the W3C and WDG validators use. Then you can validate your
HTML before you upload it for others to see.
To download NSGMLS from
James Clark’s site, look for
the link “How to get SP”. (The package is called SP, but the validator
program is NSGMLS.) There are compiled versions for DOS/Windows as
well as other platforms. The download file also contains copies of the
documentation that is on the site.
Dave Raggett’s HTML TIDY can clean up a lot of careless syntax errors, offer advice on accessibility, and also format the HTML source code in a variety of layout styles. The program is free. Binaries are available for DOS, UNIX variations, and other operating systems.
The W3C’s CSS Validator is available for download. It’s UNIX oriented; if you know of a DOS or Windows tool comparable to NSGMLS, I’d be pleased to hear about it.
(Jim Dabell kindly posted lengthy Windows installation instructions for the W3C CSS Validator, plus a bug fix, in ciwas on 1 Sep 2003; the article is archived as -cqdnZUnbu9mAc6iRVn-sg@giganews.com at Google.)
If at all possible, test in several browsers, including a modern browser like Mozilla (which is pretty good at following standards), an older version of MSIE or Netscape, and a text-only browser. Don’t expect your pages to look identical in all browsers, but you should expect all the content and navigation to be accessible.
Try turning off JavaScript (which is risky to have turned on anyway), and make sure all navigation and other features still work. Many people run with JavaScript disabled, either out of their own security consciousness or because their IT department sets up the browsers that way. (Even if your site is not aimed at businesses, remember that people do use work computers or public-library computers to view the Web.)
If you’re working on a graphics monitor, try different screen resolutions; also switch between full screen and windowed and check how the page looks when you narrow the browser window.
Temporarily turn off images in your browser. Are ALT texts displayed? Is all the navigation still usable? If you don’t have a non-graphical browser, get hold of Lynx and see how well your content is presented in text mode.
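Before you even open a browser, you can scan a page for images that have no ALT text at all. The following Python sketch uses the standard html.parser module; the class and function names are my own, and it is no substitute for a real accessibility checker.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag == "img":
            attr_map = dict(attrs)
            if "alt" not in attr_map:
                self.missing.append(attr_map.get("src", "(no src)"))


def images_missing_alt(html_text):
    """Return the src values of images that lack ALT text."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.missing


sample = (
    '<a href="canvas.jpg"><img src="thcanvas.jpg" width=200 height=150 '
    'alt="painting: Mona Lisa del Mar"></a>'
    '<img src="logo.gif">'
)
print(images_missing_alt(sample))  # ['logo.gif']
```

An image with alt="" is deliberately not flagged, since an empty ALT is the usual way to mark a purely decorative picture.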
Test the links between your pages. If your links have the form
<a href="instant.htm">link text</a>
then they will work equally well on your computer and after uploading to your Web site. You can also put links to other directories, for example
<a href="../images/coffee.jpg">link text</a>
These are examples of relative URLs, and you can read the full rules in RFC 1808.
Also remember that filenames are case sensitive
on many Web servers, so if the link is to instant.htm
you can’t call the file Instant.htm.
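You can watch relative URLs being resolved with Python's standard urllib.parse.urljoin. The base address below is made up for the example; the relative links are the ones shown above.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the links.
base = "http://www.example.com/tech/webauthr.htm"

# A bare filename resolves within the same directory as the page.
same_dir = urljoin(base, "instant.htm")

# ".." climbs one directory before descending into images/.
other_dir = urljoin(base, "../images/coffee.jpg")

# A fragment alone points within the current page.
fragment = urljoin(base, "#Usenet")

print(same_dir)   # http://www.example.com/tech/instant.htm
print(other_dir)  # http://www.example.com/images/coffee.jpg
print(fragment)   # http://www.example.com/tech/webauthr.htm#Usenet
```

Because none of the links mentions the host name, the same pages work unchanged on your own computer and on the Web server.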
Once you’ve created your pages, to make them available on the Web you need to upload them to your Web host. Your Web host or ISP should be able to help you with the process, but here’s an overview.
The home page file is typically named index.htm or index.html. (Web page file names are usually case sensitive.)
To upload, you can use ftp on the command line, but will probably do better with the free WS-FTP LE, which is available from sites like TUCOWS.
Okay, you’ve written your pages to be search-engine friendly.
You could sit back and wait to get noticed. The major search engines trawl the Web periodically, by following external links in sites they know about. So if somebody links to you, you’ll probably pop up in search engines eventually.
But of course you probably want to get listed as soon as possible. The massive Search Engine Watch gives good instructions for the major sites. A good starting point on that site is the article Search Engine Submission Tips.
Ken Churilla has a compact list of the submission addresses for lots of search engines in his page Eureka! Quick URL Submission Page.
Kirwood Inc. is in the business of submitting pages to search engines, but also offers lots of tips for submitting your pages yourself.
Here’s an interesting tool. MSIE lets you install a Google Bar, which among other things lets you know your page’s rank in Google. If you have a better browser, you can get equivalent information (unofficially) at Connaitre son PageRank sans la toolbar. The page is in French, but the window for entering your URL is obvious. When you click OK you’ll see PageRank estimé at the bottom of the results page, or else a box explaining that there is insufficient information and inviting you to try a different page.
This section lists some of the most important reference documents for Web authors.
The official HTML 4.01 spec is hosted at the World Wide Web Consortium (W3C). You may be particularly interested in the list of changes from older versions of HTML.
Many people react to the idea of actually reading a spec the way they would to a spider in their hair. While there’s no substitute for the spec (and if you try it you might find it’s easier than you think), here are two plain-English sites that are pretty comprehensive:
The W3C spec for CSS2 is located
here. Late-model browsers
cover CSS2 pretty well, though Internet Explorer 6 is
marginal.
There’s already a preliminary spec for CSS2.1. But browser coverage for the new features is pretty poor as of this writing (September 2003). What you should look at, though, is the list of changes from CSS2, because several features of CSS2 will be removed in CSS2.1. While you don’t want to use the new features of CSS2.1 yet, you probably don’t want to use the removed features of CSS2 at all.
An excellent and comprehensive plain-language reference on CSS is Brian Wilson’s Index Dot CSS.
Having trouble understanding a complex selector in someone else’s CSS (or your own)? Consult the Selectoracle.
The original proper form of a URL is explained in RFC 1738. This RFC answers tricky questions like how to include a username in telnet links and a subject line in mailto links. (For the second one, the answer is, “You can’t.”)
For
relative URLs, like #Usenet and
../images/farfalle.gif, see
RFC 1808.
RFC 2396, “Uniform Resource Identifiers (URI): Generic Syntax”, revises both of the above.
When you set colors you want to stick to the ones that are most likely to appear as expected, without dithering. Here is a very nice color cube that you can look at from all different angles, even from the inside!
An excellent static view is Doug Jacobson’s RGB Hexadecimal Color Chart, which shows the 216 non-dithering colors all on one page.
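The 216 colors are easy to generate yourself: each of the red, green, and blue channels takes one of six evenly spaced values. Here is a Python sketch of that arithmetic; the function names are mine, and the sample color is arbitrary.

```python
# The six channel values of the 216-color palette.
LEVELS = (0x00, 0x33, 0x66, 0x99, 0xCC, 0xFF)


def web_safe_colors():
    """Return all 6 x 6 x 6 = 216 colors as #RRGGBB strings."""
    return [
        f"#{r:02X}{g:02X}{b:02X}"
        for r in LEVELS
        for g in LEVELS
        for b in LEVELS
    ]


def nearest_web_safe(color):
    """Snap each channel of a #RRGGBB color to the closest palette level."""
    channels = [int(color[i:i + 2], 16) for i in (1, 3, 5)]
    snapped = [round(c / 0x33) * 0x33 for c in channels]
    return "#{:02X}{:02X}{:02X}".format(*snapped)


palette = web_safe_colors()
print(len(palette))                 # 216
print(nearest_web_safe("#FA8072"))  # #FF9966
```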
The international character set is called Unicode. Its Basic Multilingual Plane alone has places for 65,536 characters, and the full standard has room for many more. For some terminology and brief advice, please see Internationalization and Character Sets, above.
Jukka Korpela has a long Tutorial on Character Code Issues: it doesn’t offer practical advice but does thoroughly explore the issues and terminology so you’ll understand practical advice when it’s offered.
Alan Flavell’s i18n: HTML Character set issues beyond HTML3.2 demystifies authoring issues around various character sets beyond plain US-ASCII. It’s also well worth following his links to other pages at his site.
Probably the
best site for Unicode
matters is maintained by Alan Wood, and one of his most
useful pages is
Using Special
Characters from Windows Glyph List 4 (WGL4) in HTML.
WGL4 is an unofficial subset of Unicode characters. Browser
support for the WGL4 subset is somewhat better than browser support
for Unicode as a whole, and therefore you’ll have fewer problems with
the WGL4 characters than with other Unicode characters.
Even so, you can’t assume that your users will be able to view all the WGL4 characters. That depends on which browser they’re using and which fonts they’ve configured it to use. If you can, it’s probably better to avoid special characters for a while yet, as Jukka Korpela argues in On the Use of Some MS Windows Characters in HTML.
With tens of thousands of characters to choose from, finding a particular Unicode character can be a bit daunting. Jukka Korpela has a very handy page How to Find an &#number; Notation for a Character, which gives some general advice and also links to a couple of search tools.
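If you happen to have Python installed, its standard unicodedata module offers another way to find the number, provided you know the character's official Unicode name. This is a sketch of my own, not something from Korpela's page, and the function name is hypothetical.

```python
import unicodedata


def numeric_reference(unicode_name):
    """Look up a character by its Unicode name and return its &#number; form.

    Raises KeyError if the name is not a known Unicode character name.
    """
    ch = unicodedata.lookup(unicode_name)
    return f"&#{ord(ch)};"


print(numeric_reference("EM DASH"))                     # &#8212;
print(numeric_reference("EURO SIGN"))                   # &#8364;
print(numeric_reference("LEFT DOUBLE QUOTATION MARK"))  # &#8220;
```

Going the other way, unicodedata.name(ch) tells you the official name of a character you have pasted in.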
Even if you as an author do everything right, users may not be able to view your pages because they don’t have the necessary fonts, or their browsers aren’t set up as needed, or internationalization isn’t enabled in their copy of Windows.
You can refer them to Alan Flavell’s I18n - Browsers and Fonts. It gives instructions for setting up a system and a browser to view international characters.
This test page by Andreas Prilop is great for checking your settings.
this page: http://oakroadsystems.com/tech/webauthr.htm