How can you take an existing word processing document and put it up on the Web? The following steps will give you a good basic start. Remember to save your work "early and often" as you make the changes in step 7 (adding <P> tags) and the steps after it.
- Open the document in your word processor.
- Print a copy of the fully formatted file for later reference.
- If there are any embedded graphics, and your word processor can save as HTML, do so, into a new folder for your HTML pages. With any luck the graphic image files will be created automatically for you. If this doesn't work, you will have to use some other technique to obtain GIF or JPEG files of the images.
- Save it as a text file, in the new folder for your HTML pages, giving it a filename such as "syllabus.htm" (with the quotes if you are using Windows 95 or later, without the quotes on the Macintosh).
- If you are using Word, close the file and Quit out of Word. Open the text file in WordPerfect (Windows or Macintosh), in Wordpad (Windows), in SimpleText (Macintosh OS 9), or in TextEdit (Macintosh OS X).
- Add your standard tags at the top: starting with the <HTML> tag and continuing through the <BODY BGCOLOR="#FFFFFF"> tag. Be sure to fill in a title between the <TITLE> and </TITLE> tags.
Once you have your first Web page written, you will be able to copy the top tags from it, and will only have to edit the title.
- Go through the document, adding a <P> tag on each blank line (between paragraphs). Do not bother with multiple <P> tags in sequence. In order to improve readability and avoid the need to scroll left and right, you can safely add a hard return to the file anywhere that the original file had a space character.
Save your file from time to time as you proceed through this step, and once more when you finish adding the <P> tags.
- Start Netscape or Internet Explorer; the next step has two alternatives:
- if you have already set a bookmark to a directory listing of your disk drive, choose it and then navigate to your new HTML file;
- otherwise, go to the File menu and choose "Open..." (or "Open Page in Navigator...", and then "Browse..." in some versions) and navigate to the new HTML file you have saved. Once you have the file open, click in the "Location" field and delete the filename, so that only the folder name and the final slash remain. Press the "Return" or "Enter" key to see the directory listing, and then set a bookmark for later use. Use the browser's "Back" toolbar button to return to the new HTML file.
- Examine the file, as displayed by your browser, to make sure that you are on the right track. From time to time during the remaining steps, click on "Reload" after saving your work, and look it over.
- Go through the document, adding a <BR> tag at the end of each line where you do not want the browser to wrap the next line up into the same paragraph, but you also don't want to have a blank line between. Do not use multiple <BR> tags in sequence at this stage.
- Go through the document, adding an <IMG SRC="" ALT="" WIDTH="" HEIGHT="" BORDER=0> tag for each embedded graphic. You will be able to use Netscape to identify which of the graphic images created earlier is which (so that you can put the appropriate filename in for the SRC), and possibly also to tell how many pixels wide and high each is. If your browser doesn't show you the width and height in the title bar, you will need to use some other software to discover the size; for now you can leave the HEIGHT and WIDTH unspecified. Be sure to add appropriate text for the ALT, to be displayed by non-graphic browsers.
In many cases you will want to have <CENTER> and </CENTER> tags surrounding the image tag.
In most cases you will want to have either <P> or <BR> tags before and after the image tag.
- Go through the hardcopy document, finding special characters that cannot be reliably displayed on Web pages:
- "Curly" or "smart" quotes and apostrophes should be replaced by plain ones.
- Bullets as characters in the text should be replaced by the use of <UL> ... </UL> list structures, as discussed in the next section.
- "Em dash" and "En dash" characters should be replaced by hyphens or double hyphens ("-" or "--") with or without space characters preceding and following. The space characters and the doubling of the hyphen are matters of personal taste, but do be consistent.
- Special symbols, such as the ampersand, copyright symbol, and umlauted or accented vowels should be replaced by the appropriate special codes ("&" by "&amp;", "©" by "&copy;", etc.). Avoid "&#nnn;" codes where you can: if nnn is less than 127 you can use the character itself, and larger values (in particular 128 through 159, which HTML leaves undefined) may display differently on Macintosh and Windows machines.
- Tab characters must be removed or replaced by a single space character, or by one or more "non-breaking spaces" (each achieved by a "&nbsp;" code). Initial indent on a paragraph is simply not Web style. If a tab character is used in a hardcopy document to create lined-up columns, then it will probably be replaced by an HTML table.
- Add your standard tags at the bottom, concluding with </BODY> and </HTML>.
Once you have your first Web page written, you will be able to copy the bottom tags from it, and will only have to edit the URL and date.
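As a minimal sketch, the steps above might produce a file like the one below. It assumes the "standard tags" are the usual HTML, HEAD, TITLE, and BODY set; the title, body text, image filename, pixel sizes, and footer wording are all hypothetical placeholders rather than content from any real document.

```html
<HTML>
<HEAD>
<TITLE>Syllabus for a Sample Course</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF">

<!-- Hypothetical text; a P tag marks each blank line between paragraphs -->
This course meets twice a week. Readings are listed below,
and the schedule is subject to change.
<P>
<!-- BR keeps short lines separate without a blank line between them -->
Instructor: A. N. Example<BR>
Office: Room 101<BR>
Hours: Tuesdays, 2-4
<P>
<!-- Placeholder filename and sizes; ALT text is for non-graphic browsers -->
<CENTER>
<IMG SRC="image1.gif" ALT="Map of the campus" WIDTH="320" HEIGHT="200" BORDER=0>
</CENTER>
<P>
<!-- Plain quotes, a double hyphen, and named codes replace special characters -->
The text is "Research &amp; Writing" -- see the caf&eacute; handout.<BR>
&copy; 2006 A. N. Example
<P>
<!-- Hypothetical footer: the URL and date to edit on each new page -->
http://www.example.com/syllabus.htm, revised January 3, 2006.

</BODY>
</HTML>
```

Everything from <HTML> through <BODY BGCOLOR="#FFFFFF">, and the closing </BODY> and </HTML>, can be copied unchanged to the next page; only the title, URL, and date need editing.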
In most cases, the initial formatting described above will be adequate to permit the document to be read, but by adding appropriate tags to identify the structure and organization of the document, you will make your readers' lives simpler.
- Go through the printed document, identifying each section and sub-section, and assigning header levels to their headings. The document title will be level 1, the major sections (the Roman numerals in an outline) will be level 2, their sub-sections (the capital letters in an outline) will be level 3, and so on. This will obviously be easier if the document was originally well-organized. You may discover the need to add some section or sub-section titles that were not part of the original document.
- Go through the document, adding the appropriate level heading tags (e.g., <H2> ... </H2>) for each section and subsection title, based on your notes from the previous step.
- Identify all places in the printed document where you have bulleted or numbered lists. Go through the document, adding the appropriate tags for those list structures and removing the bullet characters or numbers, which the browser will now generate for you in response to the tags.
- Identify all places in the printed document where you have itemized lists embedded within paragraphs. For each such case, consider whether or not to use a bulleted or numbered list structure in the HTML version.
- If the items are few, short, and comma-separated in the prose, then you may decide to leave them as-is.
- If the items are many, long, or semi-colon-separated in the prose, then you should probably use the appropriate HTML tags for bulleted or numbered lists.
- If you use the HTML tags, be sure to separate each list item with a <P> if any of the list items is long enough to be likely to wrap. This rule is being followed right here!
- Identify any places in the printed document where you have sections that are indented, such as an extended quote from another source. Use the <BLOCKQUOTE> and </BLOCKQUOTE> tags around each such section.
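The structural markup described in this list could look roughly like the following sketch. The heading text and list items are invented for illustration; <UL> produces a bulleted list, <OL> a numbered one, and <LI> starts each item.

```html
<!-- Document title is level 1; major sections are level 2; sub-sections level 3 -->
<H1>Syllabus for a Sample Course</H1>

<H2>Readings</H2>
The required readings are:
<!-- Bulleted list; the bullet characters from the original text are removed -->
<UL>
<LI>A short item.
<P>
<LI>A much longer item that is likely to wrap onto a second line in a narrow
browser window, which is why a P tag separates the items in this list.
<P>
<LI>Another short item.
</UL>

<H3>Optional Readings</H3>
<!-- Numbered list; the browser generates the numbers -->
<OL>
<LI>First optional reading.
<LI>Second optional reading.
</OL>

<H2>Policies</H2>
<!-- A section that was indented in the printed document -->
<BLOCKQUOTE>
An extended quotation from another source, which was indented on paper.
</BLOCKQUOTE>
```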
There are two navigational issues to address: getting around from one part of your document to another, and getting between this document and other Web pages. Both of these are dealt with by using the appropriate anchor tags to create links.
- If your Web page is more than three or four screens long, create "shortcut" links at the top leading to each level 2 header. This will require anchor tags on the headers (<A NAME="label"></A><H2> [the header text] </H2>) and on the shortcuts (<A HREF="#label"> [the shortcut text] </A>).
- If you can identify places within your Web page that you or someone else might want to link to directly from other Web pages (that is, not linking to the top of your page, but rather jumping right down to a particular place in the middle of the page), make sure that the appropriate text is preceded by <A NAME="label"></A> tags.
- If you can identify places within your Web page that should have a link to another place within the same page, use the techniques discussed above for shortcuts. A likely case is to provide links back up to the top from the end of each section that you have a shortcut for.
- If you can identify places within your Web page that should have a link to someplace on another Web page, use <A HREF="URL#label"> and </A> tags, with the appropriate URL and label.
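A minimal sketch of these anchor techniques follows. The label names ("top", "readings", "policies", "section2") and the URL on www.example.com are hypothetical; substitute your own.

```html
<!-- A named anchor at the very top, plus shortcut links to each level 2 heading -->
<A NAME="top"></A>
<A HREF="#readings">Readings</A> |
<A HREF="#policies">Policies</A>
<P>

<!-- Each shortcut target: a named anchor just before the heading -->
<A NAME="readings"></A><H2>Readings</H2>
Text of the readings section.
<P>
<!-- Link back up to the top from the end of the section -->
<A HREF="#top">Return to top</A>

<A NAME="policies"></A><H2>Policies</H2>
Text of the policies section.
<P>
<!-- Link to a labeled place on another Web page -->
See <A HREF="http://www.example.com/other.html#section2">the related section</A>
on another page.
<P>
<A HREF="#top">Return to top</A>
```

Someone else can link directly to the middle of this page by appending the label to its URL, for example ending their link with "#policies".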
Once you have the organizational and navigational structure of your document in place, so that any browser will be able to display it effectively for your readers, you should take one last pass through the document to identify places where emphasis needs to be added.
- Identify all the places in the printed document where italics, boldface, all capitals, enlarged font sizes, underlining, or other techniques were used to emphasize portions of the document.
- Those items that were section or sub-section headings have already been taken care of by the header level tags; do not add any additional markup for further emphasis.
- Other items that were boldfaced should be marked up with the <STRONG> and </STRONG> tags around each such item.
- Most of the other emphasized items should also be marked up with <STRONG> and </STRONG> tags, even though other HTML tags might cause some browsers to reproduce more precisely on-screen the effect your word processor produced on paper.
- If you used italics for the title of a book in the hardcopy document, you may want to use <CITE> and </CITE> tags around each such item, which will be rendered on-screen as italics by graphical browsers. Beware that italics are difficult to read on-screen, especially on Macintosh systems. Consider using quotes (") or <STRONG> and </STRONG> tags instead.
- If you used underlining in the hardcopy document, those items should usually be marked up with either <STRONG> and </STRONG> tags or <CITE> and </CITE> tags. Underlining Web page text risks confusion with links.
- Be wary of extended sections of text with emphasis. Even a single sentence will strike many people as shouting, especially if done with all capital letters.
- Extended sections that need to be emphasized can be set off with <BLOCKQUOTE> and </BLOCKQUOTE> tags around the section, and perhaps the first or last sentence with <STRONG> and </STRONG> tags.
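The emphasis guidelines above might be applied as in this short sketch; the sentences and the book title are made up for illustration.

```html
<!-- Boldfaced or otherwise emphasized items become STRONG -->
Assignments are due <STRONG>at the start of class</STRONG>.
<P>
<!-- A book title that was italicized on paper can use CITE -->
The main text is <CITE>A Hypothetical Handbook</CITE>.
<P>
<!-- An extended emphasized section: set it off, and mark only one sentence -->
<BLOCKQUOTE>
<STRONG>Late work is not accepted.</STRONG> If you expect to miss a deadline,
talk to the instructor beforehand so that other arrangements can be made.
</BLOCKQUOTE>
```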
Dick Piccard revised this file (http://www.ohiou.edu/pagemasters/class/convert.html) on January 3, 2006.
Please E-Mail comments or suggestions to "email@example.com".