📄 roget_components.w
字号:
% This file is part of the Stanford GraphBase (c) Stanford University 1993@i boilerplate.w %<< legal stuff: PLEASE READ IT BEFORE MAKING ANY CHANGES!@i gb_types.w\def\title{ROGET\_\,COMPONENTS}\def\<#1>{$\langle${\rm#1}$\rangle$}\prerequisite{GB\_\,ROGET}@* Strong components. This demonstration program computes thestrong components of GraphBase graphs derived from Roget's Thesaurus,using a variant of Tarjan's algorithm [R. E. Tarjan, ``Depth-first@^Tarjan, Robert Endre@>search and linear graph algorithms,'' {\sl SIAM Journal on Computing\/\bf1} (1972), 146--160]. We also determine the relationshipsbetween strong components.Two vertices belong to the same strong component if and only if theyare reachable from each other via directed paths.We will print the strong components in ``reverse topological order'';that is, if |v| is reachable from~|u| but |u| is not reachablefrom~|v|, the strong component containing~|v| will be listed beforethe strong component containing~|u|.Vertices from the |roget| graph are identified both by name and bycategory number.@d specs(v) (filename? v-g->vertices+1L: v->cat_no), v->name /* category number and category name */@ We permit command-line options in \UNIX/ style so that a variety ofgraphs can be studied:The user can say `\.{-n}\<number>', `\.{-d}\<number>', `\.{-p}\<number>',and/or `\.{-s}\<number>' to change the default values of the parametersin the graph |roget(n,d,p,s)|. Or `\.{-g}\<filename>' to change thegraph itself.@^UNIX dependencies@>@p#include "gb_graph.h" /* the GraphBase data structures */#include "gb_roget.h" /* the |roget| routine */#include "gb_save.h" /* |restore_graph| */@h@#@<Global variables@>@;main(argc,argv) int argc; /* the number of command-line arguments */ char *argv[]; /* an array of strings containing those arguments */{@+Graph *g; /* the graph we will work on */ register Vertex *v; /* the current vertex of interest */ unsigned long n=0; /* the desired number of vertices (0 means infinity) */ unsigned long d=0; /* the minimum distance between categories in arcs */ unsigned long p=0; /* 65536 times the probability of rejecting an arc */ long s=0; /* the random number seed */ char *filename=NULL; /* external graph substituted for |roget| */ @<Scan the command-line options@>; g=(filename? restore_graph(filename): roget(n,d,p,s)); if (g==NULL) { fprintf(stderr,"Sorry, can't create the graph! (error code %ld)\n", panic_code); return -1; } printf("Reachability analysis of %s\n\n",g->id); @<Perform Tarjan's algorithm on |g|@>; return 0; /* normal exit */}@ @<Scan the command-line options@>=while (--argc) {@^UNIX dependencies@> if (sscanf(argv[argc],"-n%lu",&n)==1) ; else if (sscanf(argv[argc],"-d%lu",&d)==1) ; else if (sscanf(argv[argc],"-p%lu",&p)==1) ; else if (sscanf(argv[argc],"-s%ld",&s)==1) ; else if (strncmp(argv[argc],"-g",2)==0) filename=argv[argc]+2; else { fprintf(stderr,"Usage: %s [-nN][-dN][-pN][-sN][-gfoo]\n",argv[0]); return -2; }}@ Tarjan's algorithm is inherently recursive. We will implementthe recursion explicitly via linked lists, instead of using \CEE/'s runtimestack, because some computer systemsbog down in the presence of deeply nested recursion.Each vertex goes through three stages during the algorithm: First it is``unseen''; then it is ``active''; finally it becomes ``settled,'' when ithas been assigned to a strong component.The data structures that represent the current state of the algorithmare implemented by using five of the utility fields in each vertex:|rank|, |parent|, |untagged|, |link|, and |min|. We will consider each ofthese in turn.@ First is the integer |rank| field, which is zero when a vertex is unseen.As soon as the vertex is first examined, it becomes active and its |rank|becomes and remains nonzero. Indeed, the $k$th vertex to become activewill receive rank~$k$. When a vertex finally becomes settled, its rankis reset to infinity.It's convenient to think of Tarjan's algorithm as a simple adventuregame in which we want to explore all the rooms of a cave. Passageways betweenthe rooms allow one-way travel only. When we comeinto a room for the first time, we assign a new number to that room;this is its rank. Later on we might happen to enter the same roomagain, and we will notice that it has nonzero rank. Then we'll be ableto make a quick exit, saying ``we've already been here.'' (The extracomplexities of computer games, like dragons that might need to bevanquished, do not arise.)@d rank z.I /* the |rank| of a vertex is stored in utility field |z| */@<Glob...@>=long nn; /* the number of vertices that have been seen */@ The active vertices will always form an oriented tree, whose arcs area subset of the arcs in the original graph. A tree arc from |u| to~|v|will be represented by |v->parent==u|. Every active vertex has aparent, which is usually another active vertex; the only exception isthe root of the tree, whose |parent| is |NULL|.In the cave analogy, the ``parent'' of room |v| is the room we were inimmediately before entering |v| the first time. By following parentpointers, we will be able to leave the cave whenever we want.As soon as a vertex becomes settled, its |parent| field changessignificance. Then |v->parent| is set equal to the uniquerepresentative of the strong component containing vertex~|v|. Thustwo settled vertices will belong to the same strong component if and onlyif they have the same |parent|.@d parent y.V /* the |parent| of a vertex is stored in utility field |y| */@ All arcs in the original directed graph are explored systematically duringa depth-first search. Whenever we look at an arc, we tag it so thatwe won't need to explore it again. In a cave, for example, we mightmark each passageway between rooms once we've tried to go through it.The algorithm doesn't actually place a tag on its |Arc| records; instead,each vertex |v| has a pointer |v->untagged| that leads to allhitherto-unexplored arcs from~|v|. The arcs of the list that appearbetween |v->arcs| and |v->untagged| are the ones already examined.@d untagged x.A /* the |untagged| field points to an |Arc| record, or |NULL| */ @ The algorithm maintains two special stacks: |active_stack| containsall the currently active vertices, and |settled_stack| contains all thecurrently settled vertices. Each vertex has a |link| field that pointsto the vertex that is next lower on its stack, or to |NULL| if the vertex isat the bottom. The vertices on |active_stack| always appear in increasingorder of rank from bottom to top.@d link w.V /* the |link| field of a vertex occupies utility field |w| */@<Glob...@>=Vertex * active_stack; /* the top of the stack of active vertices */Vertex * settled_stack; /* the top of the stack of settled vertices */@ Finally there's a |min| field, which is the tricky part that makeseverything work. If vertex~|v| is unseen or settled, its |min| field isirrelevant. Otherwise |v->min| points to the active vertex~|u|of smallest rank having the following property:Either |u==v| or there is a directed path from |v| to |u| consisting ofzero or more mature tree arcs followed by a single non-tree arc.What is a tree arc, you ask. And what is a mature arc? Good questions. At themoment when arcs of the graph are tagged, we classify them either as treearcs (if they correspond to a new |parent| link in the tree of activenodes) or non-tree arcs (otherwise). A tree arc becomes mature when itis no longer on the path from the root to the current vertex beingexplored. We also say that a vertex becomes mature when it isno longer on that path. All arcs from a mature vertex have been tagged.We said before that every vertex is initially unseen, then active, andfinally settled. With our new definitions, we see further that every arc startsout untagged, then it becomes either a non-tree arc or a tree arc. In thelatter case, the arc begins as an immature tree arc and eventually matures.Just believe these definitions, for now. All will become clear soon.@d min v.V /* the |min| field of a vertex occupies utility field |v| */@ Depth-first search explores a graph by systematically visiting allvertices and seeing what they can lead to. In Tarjan's algorithm, aswe have said, the active vertices form an oriented tree. One of thesevertices is called the current vertex.If the current vertex still has an arc that hasn't been tagged, wetag one such arc and there are two cases: Either the arc leads toan unseen vertex, or it doesn't. If it does, the arc becomes a treearc; the previously unseen vertex becomes active, and it becomes thenew current vertex. On the other hand if the arc leads to a vertexthat has already been seen, the arc becomes a non-tree arc and thecurrent vertex doesn't change.Finally there will come a time when the current vertex~|v| has nountagged arcs. At this point, thealgorithm might decide that |v| and all its descendants form a strongcomponent. Indeed, this condition turns out to be true if and only if
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -