⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 hashgen.html

📁 产生哈希函数的代码
💻 HTML
📖 第 1 页 / 共 4 页
字号:
<html lang="en"><head><title>Hashed Lookup Genarator for Static Keysets</title><meta http-equiv="Content-Type" content="text/html"><meta name=description content="Hashed Lookup Genarator for Static Keysets"><meta name=generator content="makeinfo 4.1"><link href="http://texinfo.org/" rel=generator-home></head><body><h1>Hashed Lookup Genarator for Static Keysets</h1><p><hr>Node:<a name="Top">Top</a>,Next:<a rel=next href="#Copying">Copying</a>,Previous:<a rel=previous href="#dir">(dir)</a>,Up:<a rel=up href="#dir">(dir)</a><br><h2>hashgen: Hashed Lookup Genarator for Static Keysets</h2><ul><li><a href="#Copying">Copying</a>:     How you can copy and share <code>hashgen</code>. <li><a href="#Overview">Overview</a>:    What does it do? <li><a href="#Invoking">Invoking</a>:    Program's invocation. <li><a href="#Input">Input</a>:       Input file format. <li><a href="#Example">Example</a>:     A realistic example. <li><a href="#Tips">Tips</a>:        How to obtain better results. <li><a href="#Bugs">Bugs</a>:        Known bugs and limitations.<p></p><li><a href="#Acknowledgements">Acknowledgements</a>: </ul><p><hr>Node:<a name="Copying">Copying</a>,Next:<a rel=next href="#Overview">Overview</a>,Previous:<a rel=previous href="#Top">Top</a>,Up:<a rel=up href="#Top">Top</a><br><h2>GNU GENERAL PUBLIC LICENSE</h2><p>This program is free software; you can redistribute itand/or modify it under the terms of the<a href="http://www.gnu.org/copyleft/gpl.html">GNU General Public Licencse</a>as published by the Free Software Foundation; either version 2 of theLicense, or (at your option) any later option.<p><hr>Node:<a name="Overview">Overview</a>,Next:<a rel=next href="#Invoking">Invoking</a>,Previous:<a rel=previous href="#Copying">Copying</a>,Up:<a rel=up href="#Top">Top</a><br><h2>Overview</h2><p>The <code>hashgen</code> utility generates the source of a lookup functionfor a given static keyset (reserved words of a programminglanguage, assembler mnemonics, XML-based language elements,etc.) Every keyset's entry contains a key (either string or binary)and arbitrary user-defined atributes associated with that key. The keys must be unique.<p>The generated function will find any keyword in constant timeusing exactly <em>one</em> comparison. A typical lookup may takeas little as 100ns on an average desktop machine.<p>The source code can be generated in C++, ANSI C, or K&amp;R C.<p>The <code>hashgen</code> utility does basically the same thing and uses the samebasic algorithm as the <code>gperf</code> utility by Douglas C. Schmidt(see <a href="#Acknowledgements">Acknowledgements</a>).<p>While <code>gperf</code> is a great piece of software, I found it a bitdifficult to use. The major points addressed by <code>hashgen</code> are these:<ul><li>With <code>gperf</code> one has to select the character key positionsby hand, which may be a daunting task for large keysets withlong, highly redundant keys. Just selecting <em>all</em> positions,may yield an unnecessary complex hash function (and will not necessarilyresolve collisions).<p><code>Hashgen</code> will automatically select a minimal set of positionsthat is enough to distinguish the keys of a given keyset. </ul><ul><li>With <code>gperf</code>, the hash table of a keyset with twoor more keys that consist of the same characters and differ only in theirorder will always have collisions (so <code>gperf</code> cannot generatea perfect hash function for, say, Bourne shell keywords with its<code>if</code>/<code>fi</code> and <code>case</code>/<code>esac</code> pairs.)<p><code>Hashgen</code> will always generate a perfect hash functionfor such keysets (but see <a href="#Bugs">Bugs</a>). </ul><ul><li>It's not easy to choose a good set of options if you are not familiarwith the underlying algorithm. Even if you are, it may take morethan a few tries to get satisfactory results (that is, a colisionlesshash function and a hash table of a reasonable size.)<p>With <code>hashgen</code> you don't have to care about the algorithmicdetails (although there are options that allow you to finetune theresults). </ul><ul><li><code>Gperf</code>'s input file format is not very intuitive (especially if you are not familiar with lex/yacc) and is not self-contained. There are two (interdependent) sources of input information--theinput file and the command-line options.<p>With <code>hashgen</code>, all options can be specified in the inputfile.  Its format is comprehensible and one can easily tailor oneof the samples packaged with <code>hashgen</code> even without readingany documentation (<a href="#Example%20Input">Example Input</a>).  Any option still can bespecified on the command-line. results). </ul><ul><li>Another useful feature of <code>hashgen</code> is generation of thelookup test source that both tests the generated lookup functionand measures the average lookup time. </ul><ul><li><code>Hashgen</code> allways retains the original order of the keys. So in addition to hashed lookups you can also traverse the keys in awell defined order, or do binary searches to find the <em>closest</em>key if the key are sorted. For C++ this functionality can beeasily added in a class derived from the generated class. </ul><p><hr>Node:<a name="Invoking">Invoking</a>,Next:<a rel=next href="#Specifying%20Options">Specifying Options</a>,Previous:<a rel=previous href="#Overview">Overview</a>,Up:<a rel=up href="#Top">Top</a><br><h2>Invocation</h2><p>Synopsis:<br><pre>hashgen [<var>options</var>]<small>...</small> [<var>input-file</var>]</pre><p><var>options</var> are expected in either standard short or GNU-style long format.<p><var>input-file</var> contains the keylist, options, and C declarations. If <var>input-file</var> is not specified, the standard input is read. Only one input file can be handled at a time.<p>The generated code is written to the standard output or to the filesspecified with <code>-o</code>, <code>-c</code>, and <code>-h</code> options.<ul><li><a href="#Specifying%20Options">Specifying Options</a>: <li><a href="#Options">Options</a>: </ul><p><hr>Node:<a name="Specifying%20Options">Specifying Options</a>,Next:<a rel=next href="#Options">Options</a>,Previous:<a rel=previous href="#Invoking">Invoking</a>,Up:<a rel=up href="#Invoking">Invoking</a><br><h3>Specifying Options</h3><p>An option can be specified in either the <code>[options]</code> sectionof the input file or as a command-line option. The latter maybe specified in either short or GNU-style long form. The long form of a command-line option has the same name asin the <code>[options]</code> section of the input file with twodashes prepended.<p>Example. A fragment of an input file like this<br><pre>[options]size=3header=keylookup.h<small>...</small></pre><p>is equivalent to the following command line fragment:<br><pre>./hashgen --size=3 --header=keylookup.h <small>...</small></pre><p>or, in the short form,<br><pre>./hashgen -N3 -h keylookup.h <small>...</small></pre><p>A Boolean option <code>foo</code> can be turned on and off like this<br><pre>foo=yesfoo=no</pre><p>or like this<br><pre>foo=1foo=0</pre><p>or just<br><pre>foono-foo</pre><p>All of the above forms can be used on the command-line withtwo preceding dashes, e.g. <code>--no-foo</code>.<p>For a short form of a Boolean option, say <code>i</code>,<code>-i</code> and <code>-i+</code> turn <code>i</code> on,<code>-i-</code> turns it off.<p>Short Boolean options can be combined; the last option in a clusterof options can have non-Boolean argument, e.g. <code>-zi-l C++</code>, will turn <code>z</code> on,<code>i</code> off, and <code>l</code> will have the argument <code>C++</code>. The blank between a short non-Boolean command-line option and itsargument is not required.<p>Options specified in the input file override the options specifiedon the command-line. That is, if in the <code>[options]</code> sectionof the input file you specified <code>output=foo</code>, but on thecommand-line <code>--output=bar</code>, the output will still go to<code>foo</code>, no warnings will be issued.<p><hr>Node:<a name="Options">Options</a>,Next:<a rel=next href="#Input">Input</a>,Previous:<a rel=previous href="#Specifying%20Options">Specifying Options</a>,Up:<a rel=up href="#Invoking">Invoking</a><br><h3>Options</h3><p>This section list all program's options.  All Boolean options aregiven in their non-default form.  E.g., <code>zero-term</code> is on bydefault, so it's listed here as <code>---no-zero-term</code> or <code>-z-</code>.<dl><br><dt><code>--output=FILE</code><dt><code>-o FILE</code><dd>The name of the output source file. By default, all the output goesto stdout. If the <code>output</code> option is specified, the generatedlookup code goes to the specified file.<p>In order to separate the header and the implementation, use<code>header</code> and <code>source</code> options instead.<br><dt><code>--header=HEADER</code><dt><code>--source=SOURCE</code><dt><code>-h HEADER</code><dt><code>-c SOURCE</code><dd>If you want to separate the lookup interface and its implementation,you must specify the name of the header file with <code>-h</code> option,and the name of the implementation file with <code>-c</code> option.<p>If you specify, say, only <code>-h</code>, the implementation sourcewill go to the file specified with <code>-o</code> if it's given, orto the standard output if it's not.<br><dt><code>--language=LANG</code><dt><code>-l LANG</code><dd>Specifies the lookup source language. Valid values are"ANSI C" (or just "C"), "K&amp;R C" (or "KR"), and "C++". The default value is "ANSI C".<br><dt><code>--namespace-name=NAME</code><dt><code>-n NAME</code><dd>The name of the namespace the generated lookup class will be placed in. If the value is empty (default), the global namespace is used. The value of the option is ignored for languages other than "C++".<br><dt><code>--lookup-func-name=NAME</code><dt><code>-f NAME</code><dd>Name of the lookup function. The default value is "lookup". The lookup function has the following signature:<br><pre>struct entry* lookup(const char* key, size_t len);</pre><br><dt><code>--entry-struct-name=NAME</code><dt><code>-e NAME</code><dd>The name of the key table entry struct. The default value is "entry".<br><dt><code>--key-name=NAME</code><dt><code>-k NAME</code><dd>The key field name. The default value is "name".<br><dt><code>--lookup-struct-name=NAME</code><dt><code>-s NAME</code><dd>The name of the lookup structure (class). The default value is "lookup_struct".<p>Even if the selected language is C, all the lookup tables are bundledinto a struct with this name.<br><dt><code>--lookup-data-name=NAME</code><dt><code>-d NAME</code><dd>The name of the lookup data (instance). The default value is "lookup_data".<br><dt><code>--key-table-name=NAME</code><dd>The name of the key table. The default value is "key". The key table is an array of keys and attributes.<br><dt><code>--weight-table-name=NAME</code><dd>The name of the weight table. The default value is "weight".<br><dt><code>--hash-table-name=NAME</code><dd>The name of the hash table. The default value is "hash".<br><dt><code>--length-table-name=NAME</code><dd>The name of the key length table. The default value is "length". The length table is generated only for binary keys (unlessall the keys are of the same length).<br><dt><code>--collision-table-name=NAME</code><dd>The name of the collision table. The default value is "collision". The collision table is generated only if the hash function is not perfect(there are two or more keys with the same hash value.)<br><dt><code>--test=FILE</code><dt><code>-T FILE</code><dd>Generate a test source file with the given name. An executable compiled from the test source will both test the generatedlookup function, and mesure the average lookup time.<p>If the <code>header</code> option was specified, the generated test sourcewill include the lookup header (and you will have to link the test programwith the lookup implementation). Otherwise, it will contain a <em>copy</em>of the lookup source, and the test executable may be compiledfrom the test source alone.<br><dt><code>--inline-lookup-func</code><dt><code>-I</code><dd>Make the lookup function inline.<p>The <code>inline</code> prefix will be <code>#ifdef</code>ed, it will expand to<code>inline</code> for C++, <code>__inline__</code> for GNU, or,if <code>INLINE</code> macro is defined (define it in the <code>[declarations]</code>section), to <code>INLINE</code>.<br><dt><code>--static-lookup-data</code><dt><code>-t</code><dd>Hide the lookup tables inside the lookup function. If the selectedlanguage is C++, the option is ignored. For C++ the lookup tables

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -