<html><head><!-- This document was created from RTF source by rtftohtml version 2.7.5 --><title>rtftohtml Users Guide</title></head><body>
<h1><a name="RTFToC1">rtftohtml Users Guide</a></h1>
This document contains directions for using the <i>rtftohtml</i> filter.
<p>There are two ways to use the rtftohtml filter. You can take existing documents and translate them to HTML, or write new documents explicitly for the World Wide Web. The filter should accommodate both uses.
<h1><a name="RTFToC2">An Overview of rtftohtml</a></h1>
rtftohtml reads RTF format documents and translates them to HTML. In processing text, the filter chooses HTML markup based on three characteristics. These are:
<ol>
<li> The destination of the text. Example destinations are header, footer, footnote, picture.
<li> The paragraph style. Paragraph styles are user-definable entities, but some are pre-defined by the word processing package. For Microsoft Word (on the Macintosh) examples are "Normal" and "heading 1".
<li> The text attributes. Examples of text styles are bold, courier, 12 point.
</ol>
<p>The filter has built-in rules for dealing with destinations. For paragraph and text styles, the rules for translation are contained in a file called html-trans. By modifying this file, you can train rtftohtml to perform the correct translations for your documents. The most common change that you will need to make is to add your own paragraph styles to html-trans.
<p>rtftohtml should produce reasonable HTML output for most documents.
Here is what you can expect:
<ul>
<li> Your output should appear in a file called "xx.html" where "xx" or "xx.rtf" was your input file name.
<li> Bold, italic and underlined text should appear with &lt;b&gt;, &lt;i&gt; and &lt;u&gt; markup.
<li> Courier font text should appear with &lt;tt&gt; markup.
<li> Tables will be formatted using &lt;pre&gt; markup (only plain text is supported in tables).
<li> Footnotes will appear in a separate document with hypertext links to them.
<li> Table of contents, indexes, headers and footers are discarded.
<li> Table of Contents entries and paragraphs with the style "heading 1..6" will generate a hypertext Table of Contents in a separate file. Each table of contents entry will link to the correct location in the main document.
<li> All paragraph styles used in your document must appear in the file "html-trans". This allows you to create a mapping from any paragraph style to any HTML markup. There are many pre-defined styles in html-trans, including "heading 1..6". (If a paragraph style is not found, a warning will be generated and the text will be written to the HTML file with no special markup.)
<li> Each graphic in your file will be written out to a separate file. The filename will be "xxn.ext" where "xx" or "xx.rtf" was your input, "n" is a unique number and "ext" will be either "pict" for Macintosh PICT format graphics or "wmf" for Windows Meta-File format graphics. The HTML file will contain links to these files, using either "&lt;A HREF=" or "&lt;IMG SRC=" links. <b>Since most WWW browsers do not understand "wmf" or "pict" format files, the link will be to xxn.gif. </b>This presumes that you will run some <b>other</b> filter to translate your graphic files to gif.
<li> Text that is connected with copy/paste-link constructs will generate hypertext links.
</ul>
<h1><a name="RTFToC3">How it works</a></h1>
rtftohtml begins by reading html-trans and the character translation files. The rest of the processing is a loop of reading your RTF file and writing HTML.
A high-level overview of this loop looks like this:
<ol>
<li> Read the next character. In doing so, the filter also reads all of the RTF markup that specifies the destination, paragraph and text styles of the next character.
<li> Process the destination information. Normally, text is destined for the "body" of the document. Sometimes, the text belongs in a header, footnote or footer. The filter discards any text for headers and footers. For a footnote, the filter writes the text at the end of the document and generates a link to it.
<li> Process any SPECIAL text styles. The filter compares the text style information to see if it matches any entries in the .TMatch table (in html-trans). If there is a match and the entry is for "_Discard", "_Literal", "_Hot", "_HRef", "_Name" or "_Footnote" then the text will be processed accordingly. For example, "_Discard" text is discarded and "_Name" text will generate an anchor using the text as a name.
<li> If the text was not SPECIAL, process the paragraph style. The filter takes the name of the paragraph style and looks it up in the list of paragraph styles in html-trans (in the .PMatch table). If the paragraph style is not found in the table, the filter uses the first entry, "Normal". This entry has a nesting level and the name of the HTML "paragraph"<a href="Users_Guide_fn.html#fn0">[1]</a> markup to use. Using the HTML "paragraph" markup name, the filter (using the .PTag table) knows what tags to generate for the text.
<li> If the text was not SPECIAL, process the text styles again. The filter compares the text style information to see if it matches any entries in the .TMatch table (in html-trans). In this step, it is possible to match more than one entry. For each matched entry in the .TMatch table, the filter uses the HTML "text" markup name to look up (in the .TTag table) the tags to generate for the text.
</ol>
<p>Using this process, the filter can generate any HTML markup for any combination of paragraph style and text style.
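<p>The paragraph-style step of the loop above can be sketched in a few lines of Python. This is only an illustration, not the filter's actual code: the <tt>PMATCH</tt> and <tt>PTAG</tt> structures, their field order, the empty tags used for "Normal", and the function names are all assumptions made for the example. Real html-trans records carry more fields than shown here.

```python
# Hypothetical, simplified stand-ins for the .PMatch and .PTag tables.
# Field order and contents are assumptions made for this sketch.
PMATCH = [
    # (paragraph style name, nesting level, .PTag entry name)
    ("Normal", 0, "Normal"),
    ("heading 1", 0, "h1"),
]
PTAG = {
    # .PTag entry name -> (start tag, end tag)
    "Normal": ("", ""),          # assumed: no special markup
    "h1": ("<h1>", "</h1>"),
}


def lookup_paragraph_markup(style_name):
    """Return (start_tag, end_tag) for an RTF paragraph style name."""
    for name, _nesting, ptag_name in PMATCH:
        if name == style_name:
            return PTAG[ptag_name]
    # Style not found: fall back to the first .PMatch entry ("Normal").
    return PTAG[PMATCH[0][2]]


def wrap_paragraph(style_name, text):
    """Wrap text in the tags chosen for its paragraph style."""
    start, end = lookup_paragraph_markup(style_name)
    return f"{start}{text}{end}"
```

<p>In this sketch, a paragraph styled "heading 1" is wrapped in the h1 tags, while a style missing from the table falls through to the first entry, mirroring the fallback described in the paragraph-style step.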
<p><h1><a name="RTFToC4">What about Graphics?</a></h1>
Graphics are embedded in RTF in either a binary format or an (ASCII) hex dump of that binary. I have never seen a binary format graphic, and I don't think that the filter will process binary correctly. It does handle the hex format of graphics, by converting the hex back into binary and writing the binary to a file. The file extension is chosen by looking at the original type of the graphic. The following list shows the file types and their extensions:<p>
<dl>
<dt>Macintosh PICT
<dd>.pict - also, 256 bytes of nulls are prepended to the graphic. This is to conform to the PICT file format.
<dt>Windows Meta-files
<dd>.wmf <a href="Users_Guide_fn.html#fn1">[2]</a>
<dt>Windows Bit-map
<dd>.bmp
</dl>
<p>In addition, the filter produces a link to the file containing the graphic. Now, since the above graphic formats are not very portable, the filter assumes that you will convert these files to something more useful, like GIF. So the format of the link is:
<p><tt>&lt;a href="basenameN.ext"&gt;Click here for a Picture&lt;/a&gt;</tt>
<p>where
<ul>
<li> <tt>basename</tt> is the name of the input document (without the .rtf extension)
<li> <tt>N</tt> is a unique number (starting at 1)
<li> <tt>ext</tt> is an extension. This defaults to GIF, but can be overridden with the -P command line option.
</ul>
You can also change the link to an IMG form. If you specify the -I command line option, all links to graphics will be of the form:
<p><tt>&lt;IMG src="basenameN.ext"&gt;</tt>
<p>There is one other special case. If a graphic is encountered when the filter is in the process of generating a link, the IMG form of the link is used even without the -I command line option.
<h1><a name="RTFToC5">Special Processing</a></h1>
In the following discussion of SPECIAL processing, I will assume that rtftohtml has not been customized. If it has, the text styles used to create special effects may be different.
<h2><a name="RTFToC6">Making a Named Anchor</a></h2>
To make a named anchor, you simply enter the name in the document where you would like the anchor to appear. Then format the text using Outline and Hidden. Be careful to format ONLY the name, and not any leading or trailing spaces or paragraph marks. As an example, if the text - Named Anchor Example - were formatted with Outline and Hidden, it would produce the HTML output:
<p>&lt;a name="Named Anchor Example"&gt;&lt;/a&gt;
<p>To change the formatting that produces named anchors, you need to modify the entry in html-trans that specifies "_Name" formatting.
<h2><a name="RTFToC7">Footnote/Endnote Processing</a></h2>
If your RTF document contains footnotes or endnotes, the filter will place the text of the footnote in a separate HTML document. At the footnote reference mark, the filter will generate a hypertext link to the text of the footnote. This works with either automatically numbered footnotes<a href="Users_Guide_fn.html#fn2">[3]</a>, or user supplied footnote reference marks<a href="Users_Guide_fn.html#fn3">[+]</a>.
<h2><a name="RTFToC8">Discarding Unwanted Text</a></h2>
If you have text that you do not want to appear in the HTML output, simply format the text as Hidden and Plain (that is, no underline, outline...).
<p>If you wish to modify the formatting that discards text, you need to change the entry in html-trans that specifies "_Discard".
<h2><a name="RTFToC9">Embedding HTML in a Document</a></h2>
Normally, if your RTF document contained the text "&lt;cite&gt;hello&lt;/cite&gt;", the translator would output this as: "&amp;lt;cite&amp;gt;hello&amp;lt;/cite&amp;gt;". This ensures that the text would appear in your HTML output exactly as it appeared in the original RTF document. If, however, you want the &lt;cite&gt;&lt;/cite&gt; to be interpreted as HTML markup, you must format the tags using Hidden and Shadow. The filter will then send the tags through without translation.
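<p>The difference between normal and literal text can be illustrated with a short Python sketch. It is not the filter's implementation: the function name <tt>emit_text</tt> and its <tt>literal</tt> flag are hypothetical, standing in for whatever the filter does when text matches the "_Literal" entry.

```python
import html


def emit_text(text, literal=False):
    """Return the HTML to write for a run of RTF text.

    literal=True stands in for text formatted as Hidden and Shadow
    (hypothetical flag for this sketch); such text is passed through
    untouched. All other text has its markup characters escaped.
    """
    if literal:
        return text
    # Escape &, < and > so the text displays exactly as typed.
    return html.escape(text, quote=False)
```

<p>With this sketch, <tt>emit_text</tt> on the cite example escapes the angle brackets by default, and returns the tags unchanged when <tt>literal=True</tt>.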
<p>When the rtftohtml filter produces HTML markup, it keeps track of the nesting level of tags to ensure that you don't get something like &lt;b&gt;&lt;cite&gt;hello&lt;/b&gt;&lt;/cite&gt;, which would be incorrect markup. If you embed HTML markup in your document, the filter will NOT be aware of it. You must ensure that your markup appears correctly nested.
<p>If you wish to modify the formatting for embedded HTML, you need to change the entry in html-trans that specifies "_Literal".
<h1><a name="RTFToC10">Customizing rtftohtml</a></h1>
Some customizations of rtftohtml require a little understanding of how the filter works, others require a lot. All of the customizations involve editing either html-trans or one of the character translation files.
<h2><a name="RTFToC11">html-trans File Format</a></h2>
In html-trans there are four tables. They are .PTag, .TTag, .PMatch and .TMatch. Each table begins with its name (in column one) and continues until the next table starts. All blank lines and lines beginning with a '#' are discarded. '#' lines are typically used for comments. The tables themselves are composed of records containing a fixed number of fields which are separated by commas. The fields are either strings (which should be quoted), integers or bitmasks.<p>
<h3><a name="RTFToC12">.PTag Table</a></h3>
Each entry in the .PTag table describes an HTML paragraph markup. The format is:
<p>.PTag
<p>#"name","starttag","endtag","col2mark","tabmark","parmark",allowtext,cannest,DeleteCol1,fold,TocStyl
<dl>
<dt><b>name</b>
<dd>A unique name for this entry. These names are referenced in the .PMatch table.
<dt><b>starttag</b>
<dd>This string will be output once at the beginning of any text for this markup.
<dt><b>endtag</b>
<dd>This string will be output once at the end of any text for this markup.
<dt><b>col2mark</b>
<dd>This string will be output in place of the first tab in every paragraph (used for lists).
<dt><b>parmark</b>
<dd>This string will be output in place of each paragraph mark.
(usually &lt;br&gt; or &lt;p&gt;)
<dt><b>allowtext</b>
<dd>If 0, no text markup will be allowed within this markup (for example, &lt;pre&gt; or &lt;h1&gt; don't format well if they contain additional markup).
<dt><b>cannest</b>
<dd>If 1, other paragraph markup will be allowed to nest within this markup (used for nesting lists).
<dt><b>DeleteCol1</b>
<dd>If 1, all text up to the first tab in a paragraph will be deleted (used to strip out bullets when converting to unordered lists (&lt;ul&gt;)).
<dt><b>fold</b>
<dd>If 1, the filter will add newlines to the HTML to keep the number of characters in a line to less than 80. For &lt;pre&gt; or &lt;listing&gt; elements, this should be set to 0.
<dt><b>TocStyl</b>
<dd>The TOC level. If greater than 0, the filter will create a Table of Contents entry for every paragraph using this markup.
</dl><p>
<h4><a name="RTFToC13">Sample .PTag Entries</a></h4>
<pre>"h1","&lt;h1&gt;\n","&lt;/h1&gt;\n","\t","\t","&lt;br&gt;\n",0,0,0,1</pre>
This is a level 1 heading. The "\n" in the start and end-tag fields forces a newline in the HTML markup. Since newlines are ignored in HTML (except in &lt;pre&gt;), its only effect is to make the HTML output more readable. There is no difference between the first tab and any other. They both translate to a tab mark. Paragraph marks generate "&lt;br&gt;" followed by a newline (just for looks). Text markup (like &lt;b&gt;) is not allowed within &lt;h1&gt; text,