📄 abstract.plaintex

📁 模拟器提供了一个简单易用的平台
💻 PLAINTEX
字号:
% EXTENDED ABSTRACT DESCRIBING THE STANFORD GRAPHBASE\magnification\magstep1\advance\vsize by 1.5\baselineskip\parskip3pt plus 1pt\font\sc=cmcsc10\def\disleft#1:#2:#3\par{\par\hangindent#1\noindent	 \hbox to #1{#2 \hfill \hskip .1em}\ignorespaces#3\par}\def\TeX{T\hbox{\hskip-.1667em\lower.424ex\hbox{E}\hskip-.125em X}}\def\biba{\par\parindent 40pt\hangindent 60pt}\centerline{\bf The Stanford GraphBase:                A Platform for Combinatorial Computing}\smallskip\centerline{\sl Donald E. Knuth, Stanford University}\noindentA highly portable collection of programs and data is nowavailable to researchers who study combinatorial algorithms and datastructures. All files are in the public domain and usable withonly one restriction: They must not be changed! A~``change file''mechanism allows local customization while the master files stayintact.The programs are intended to be interesting in themselves as examplesof ``literate programming.'' Thus, the Stanford GraphBase can also beregarded as a collection of approximately 30 essays for programmers toenjoy reading, whether or not they are doing algorithmic research. Theprograms are written in {\tt CWEB}, a~combination of \TeX\ and~C thatis easy to use by anyone who knows those languages and easy to read byanyone familiar with the rudiments of~C. (The {\tt CWEB} system isitself portable and in the public domain.)Four program modules constitute the {\it kernel\/} of the GraphBase:{\biba{\sc gb\_$\,$flip} is a portable random number generator;\biba{\sc gb\_$\,$graph} defines standard data structures for graphs andincludes routines for storage allocation;\biba{\sc gb\_$\,$io} reads data files and makes sure they are uncorrupted;\biba{\sc gb\_$\,$sort} is a portable sorting routine for 32-bit keysin linked lists of nodes.}\noindentAll of the other programs rely on {\sc gb\_$\,$graph} and some subsetof the other three parts of the kernel.A dozen or so {\it generator modules\/} construct graphs that are ofspecial interest in algorithmic studies. For example, {\scgb\_$\,$basic} contains 12~subroutines to produce standard graphs,such as the graphs of queen moves on $d$-dimensional rectangularboards with ``wrap-around'' on selected coordinates. Another generatormodule, {\sc gb\_$\,$rand}, produces several varieties ofrandom graphs.Each graph has a unique identifier that allows researchers all overthe world to work with exactly the same graphs, even when those graphsare ``random.'' Repeatable experiments and standard benchmarks willtherefore be possible and widely available.Most of the generator modules make use of {\it data sets}, which theauthor has been collecting for 20~years in an attempt to provideinteresting and instructive examples for some forthcoming books oncombinatorial algorithms ({\sl The Art of Computer Programming},Volumes 4A, 4B, and~4C). For example, one of the data sets is {\ttwords.dat}, a~collection of 5-letter words of English that the authorbelieves is ``complete'' from his own reading experience. Each word isaccompanied by frequency counts in various standard corpuses of text,so that the most common terms can be singled out if desired. {\scgb\_$\,$words} makes a subset of words into a graph by saying that twowords are adjacent when they agree in~4 out of~5 positions. Thus, wecan get from {\tt words} to {\tt graph} in seven steps:\disleft 30pt::{\tt words, wolds, golds, goads, grads, grade, grape,  graph.}\noindentThis is in fact the shortest such chain obtainable from {\ttwords.dat}.A dozen or so {\it demonstration modules\/} are also provided, asillustrations of how the generated graphs can be used. For example,the {\sc ladders} module is an interactive program to construct chainsof 5-letter words like the one just exhibited, using arbitrary subsetsof the data. If we insist on restricting our choices to the 2000 mostcommon words, instead of using the entire collection of about 5700, theshortest path from {\tt words} to {\tt graph} turns out to havelength~20:\disleft 30pt::{\tt words, lords, loads, leads, leaps, leapt, least,}\vskip-5pt\disleft 30pt::{\tt  lease, cease, chase, chose, chore, shore, shone,}\vskip-5pt\disleft 30pt::{\tt phone, prone, prove, grove, grave,grape, graph.}Several variations on this theme have also been implemented. If we considerthe distance between adjacent words to be alphabetic distance, forexample, the shortest path from {\tt words} to {\tt graph} turns outto be \disleft 30pt::{\tt words} (3) {\tt woods} (16) {\tt goods} (14) {\tt goads} (3){\tt grads} (14) {\tt grade} (12) {\tt grape} (3) {\tt graph},\noindenttotal length 65. The {\tt LADDERS} module makes use of another GraphBase module called{\sc gb\_$\,$dijk}, which carries out Dijkstra's algorithm forshortest paths and allows the user to plug in arbitraryimplementations of priority queues so that the performance ofdifferent queuing methods can be compared.The graphs produced by {\sc gb\_$\,$words} are undirected. Othergenerator modules, like {\sc gb\_$\,$roget}, produce directed graphs.Roget's famous {\sl Thesaurus\/} of 1882 classified all concepts into 1022categories, which we can call the vertices of a graph; an arc goesfrom~$u$ to~$v$ when category~$u$ contains a cross reference tocategory~$v$ in Roget's book. A~demonstration module called {\scroget\_$\,$components} determines the strong components of graphsgenerated by {\sc gb\_$\,$roget}. This program is an exposition ofTarjan's algorithm for strong components and topological sorting ofdirected graphs.Similarly, world literature leads to further interesting families of undirectedgraphs viathe {\sc gb\_$\,$books} module. Five data sets {\tt anna.dat}, {\ttdavid.dat}, {\tt homer.dat}, {\tt huck.dat}, and {\tt jean.dat} giveinformation about {\sl Anna Karenina}, {\sl David Copperfield}, {\slThe Iliad}, {\sl Huckleberry Finn}, and {\sl Les Mis\'erables}. Asyou might expect, the characters of each work become the vertices of agraph. Two vertices are adjacent if the corresponding charactersencounter each other, in selected chapters of the book. A~demonstration program called{\sc book\_$\,$components} finds the blocks (i.e., biconnectedcomponents) of these graphs using the elegant algorithm of Hopcroftand Tarjan.Another module, {\sc gb\_$\,$games}, generates graphs based on collegefootball scores. All the games from the 1990 season between America's leading 120teams are recorded in {\tt games.dat}; this data leads to ``cliquey''graphs, because most of the teams belong to leagues and they playevery other team in their league. The overall graph is, however,connected. A~demonstration module called {\sc football} finds longchains of scores, to prove for instance that Stanford might have trouncedHarvard by more than 2000 points if the two teams had met---becauseStanford beat Notre Dame by~5, and Notre Dame beat Air Force by~30,and Air Force beat Hawaii by~24, and \dots~, and Yale beat Harvardby~15. (Conversely, a~similar ``proof'' also ranks Harvard overStanford by more than 2000 points.) No good algorithm is known forfinding the optimum solution to problems like this, so the dataprovides an opportunity for researchers to exhibit better and bettersolutions with better and better techniques as algorithmicprogress is made.The {\sc gb\_$\,$econ} module generates directed graphs based on theflow of money between industries in the US economy. A~variety ofgraphs can be obtained, as the economy can be divided into any number ofsectors from~2 to~79 in this model. A~demonstration program {\sc econ\_$\,$order}attempts to rank the sectors in order from ``suppliers'' to``consumers,'' namely to permute rows and columns of a matrix so as tominimize the sum of entries above the diagonal. A reasonably efficientalgorithm for this problem is known, but it is very complicated. Twoheuristics are implemented for comparison, one ``greedy'' and the other``cautious.'' Greed appears to be victorious, at least in the economic sphere.The highway mileage between 128 North American cities appears in {\ttmiles.dat}, and the {\sc gb\_$\,$miles} module generates a variety ofgraphs from~it. Of special interest is a demonstration module called{\sc miles\_$\,$span}, which computes the minimum spanning trees ofgraphs output by {\sc gb\_$\,$miles}. Four algorithms for minimumspanning trees are implemented and compared, including some that aretheoretically appealing but do not seem to fare so well in practice.An approach to comparison of algorithms called ``mem counting'' isshown in this demonstration to be an easily implementedmachine-independent measure of efficiency that gives a reasonably faircomparison between competing techniques.A generator module called {\sc gb\_$\,$raman} produces ``Ramanujangraphs,'' which are important because of their role as expandergraphs, useful for communication. A~demonstration module called {\scgirth} computes the shortest circuit and the diameter of Ramanujangraphs. Notice that some graphs, like those produced by {\sc gb\_$\,$basic} or{\sc gb\_$\,$raman}, have a rigid mathematical structure; others, likethose produced by {\sc gb\_$\,$roget} or {\sc gb\_$\,$miles}, are more``organic'' in nature. It is interesting and important to testalgorithms on both kinds of graphs, in order to see if there is anysignificant difference in performance.A generator module called {\sc gb\_$\,$gates} produces graphs of logiccircuits. One such family of graphs is equivalent to a simple RISCchip, a~programmable microcomputer with a variable number of registers.Using such a ``meta-network''of gates, algorithms for design automation can be tested for a rangeof varying parameters. A~demonstration module {\sc take\_$\,$risc}simulates the execution of the chip on a sample program. Anothermeta-network of gates will perform parallel multiplication of $m$-bitnumbers by $n$-bit numbers or by an $n$-bit constant; the {\scmultiply} module demonstrates these circuits.Planar graphs are generated by {\sc gb\_$\,$plane}, which includesamong other things an implementation of the best currently knownalgorithm for Delaunay triangulation.Pixel data can lead to interesting bipartite graphs. Leonardo's {\slGioconda\/} is represented by {\tt lisa.dat}, an array of pixels thatis converted into graphs of different kinds by {\sc gb\_$\,$lisa}.A~demonstration routine {\sc assign\_$\,$lisa} solves the assignmentproblem by choosing one pixel in each row and in each column so thatthe total brightness of selected pixels is maximized. Although theassignment problem being solved here has no relevance to artcriticism or art appreciation, it does have great pedagogical value,because there is probably no better way to understand thecharacteristics of a large array of numbers than to perceive the arrayas an image.A~module called {\sc gb\_$\,$save}converts GraphBase graphs to and from an ASCII format thatreadily interfaces with other systems for graph manipulation.For further information see {\sl The Stanford GraphBase}, published byACM Press in 1993. The book could also becalled ``Fun and games with the Stanford GraphBase,'' because thedemonstration programs are great toys to play with. Indeed, the authorfirmly believes that the best serious work is also good fun. Weneedn't apologize if we enjoy doing research.\looseness-1\bye
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -