isomorphism-impl-v2.w

来自「Boost provides free peer-reviewed portab」· W 代码 · 共 1,068 行 · 第 1/3 页

W
1,068
字号
\documentclass[11pt]{report}%\input{defs}\usepackage{math}\usepackage{jweb}\usepackage{lgrind}\usepackage{times}\usepackage{fullpage}\usepackage{graphicx}\newif\ifpdf\ifx\pdfoutput\undefined   \pdffalse\else   \pdfoutput=1   \pdftrue\fi\ifpdf  \usepackage[              pdftex,              colorlinks=true, %change to true for the electronic version              linkcolor=blue,filecolor=blue,pagecolor=blue,urlcolor=blue              ]{hyperref}\fi\ifpdf  \newcommand{\stlconcept}[1]{\href{http://www.sgi.com/tech/stl/#1.html}{{\small \textsf{#1}}}}  \newcommand{\bglconcept}[1]{\href{http://www.boost.org/libs/graph/doc/#1.html}{{\small \textsf{#1}}}}  \newcommand{\pmconcept}[1]{\href{http://www.boost.org/libs/property_map/#1.html}{{\small \textsf{#1}}}}  \newcommand{\myhyperref}[2]{\hyperref[#1]{#2}}  \newcommand{\vizfig}[2]{\begin{figure}[htbp]\centerline{\includegraphics*{#1.pdf}}\caption{#2}\label{fig:#1}\end{figure}}\else  \newcommand{\myhyperref}[2]{#2}  \newcommand{\bglconcept}[1]{{\small \textsf{#1}}}  \newcommand{\pmconcept}[1]{{\small \textsf{#1}}}  \newcommand{\stlconcept}[1]{{\small \textsf{#1}}}  \newcommand{\vizfig}[2]{\begin{figure}[htbp]\centerline{\includegraphics*{#1.eps}}\caption{#2}\label{fig:#1}\end{figure}}\fi\newcommand{\code}[1]{{\small{\em \textbf{#1}}}}\newcommand{\isomorphic}{\cong}\begin{document}\title{An Implementation of Isomorphism Testing}\author{Jeremy G. Siek}\maketitle\section{Introduction}This paper documents the implementation of the \code{isomorphism()}function of the Boost Graph Library.  The implementation was by JeremySiek with algorithmic improvements and test code from Douglas Gregorand Brian Osman.  The \code{isomorphism()} function answers thequestion, ``are these two graphs equal?''  By \emph{equal} we meanthe two graphs have the same structure---the vertices and edges areconnected in the same way. The mathematical name for this kind ofequality is \emph{isomorphism}.More precisely, an \emph{isomorphism} is a one-to-one mapping of thevertices in one graph to the vertices of another graph such thatadjacency is preserved. Another words, given graphs $G_{1} =(V_{1},E_{1})$ and $G_{2} = (V_{2},E_{2})$, an isomorphism is afunction $f$ such that for all pairs of vertices $a,b$ in $V_{1}$,edge $(a,b)$ is in $E_{1}$ if and only if edge $(f(a),f(b))$ is in$E_{2}$.Both graphs must be the same size, so let $N = |V_1| = |V_2|$. Thegraph $G_1$ is \emph{isomorphic} to $G_2$ if an isomorphism existsbetween the two graphs, which we denote by $G_1 \isomorphic G_2$.In the following discussion we will need to use several notions fromgraph theory. The graph $G_s=(V_s,E_s)$ is a \emph{subgraph} of graph$G=(V,E)$ if $V_s \subseteq V$ and $E_s \subseteq E$.  An\emph{induced subgraph}, denoted by $G[V_s]$, of a graph $G=(V,E)$consists of the vertices in $V_s$, which is a subset of $V$, and everyedge $(u,v)$ in $E$ such that both $u$ and $v$ are in $V_s$.  We usethe notation $E[V_s]$ to mean the edges in $G[V_s]$.In some places we express a function as a set of pairs, so the set $f= \{ \pair{a_1}{b_1}, \ldots, \pair{a_n}{b_n} \}$means $f(a_i) = b_i$ for $i=1,\ldots,n$.\section{Exhaustive Backtracking Search}\label{sec:backtracking}The algorithm used by the \code{isomorphism()} function is, atfirst approximation, an exhaustive search implemented viabacktracking.  The backtracking algorithm is a recursive function. Ateach stage we will try to extend the match that we have found so far.So suppose that we have already determined that some subgraph of $G_1$is isomorphic to a subgraph of $G_2$.  We then try to add a vertex toeach subgraph such that the new subgraphs are still isomorphic to oneanother. At some point we may hit a dead end---there are no verticesthat can be added to extend the isomorphic subgraphs. We thenbacktrack to previous smaller matching subgraphs, and try extendingwith a different vertex choice. The process ends by either finding acomplete mapping between $G_1$ and $G_2$ and return true, or byexhausting all possibilities and returning false.We consider the vertices of $G_1$ for addition to the matched subgraphin a specific order, so assume that the vertices of $G_1$ are labeled$1,\ldots,N$ according to that order. As we will see later, a goodordering of the vertices is by DFS discover time.  Let $G_1[k]$ denotethe subgraph of $G_1$ induced by the first $k$ vertices, with $G_1[0]$being an empty graph. We also consider the edges of $G_1$ in aspecific order. We always examine edges in the current subgraph$G_1[k]$ first, that is, edges $(u,v)$ where both $u \leq k$ and $v\leq k$. This ordering of edges can be acheived by sorting the edgesaccording to number of the larger of the source and target vertex.Each step of the backtracking search examines an edge $(u,v)$ of $G_1$and decides whether to continue or go back. There are three cases toconsider:\begin{enumerate}\item $i \leq k \Land j \leq k$. Both $i$ and $j$ are in $G_1[k]$.  Wecheck to make sure the $(f(i),f(j)) \in E_2[S]$.\item $i \leq k \Land j > k$. $i$ is in the matched subgraph $G_1[k]$,but $j$ is not. We are about to increment $k$ try to grow the matchedsubgraph to include $j$. However, first we need to finalize our checkof the isomorphism between subgraphs $G_1[k]$ and $G_2[S]$.  At thispoint we are guaranteed to have seen all the edges to and from vertex $k$(because the edges are sorted), and in previous steps we have checkedthat for each edge incident on $k$ in $G_1[k]$ there is a matchingedge in $G_2[S]$.  However we have not checked that for each edgeincident on $f(k)$ in $E_2[S]$, there is a matching edge in $E_1[k]$(we need to check the ``only if'' part of the ``if and only if'').Therefore we scan through all the edges $(u,v)$ incident on $f(k)$ andmake sure that $(f^{-1}(u),f^{-1}(v)) \in E_1[k]$. Once this check hasbeen performed, we add $f(k)$ to $S$, we increment $k$ (so now $k=j$),and then try assigning the new $k$ to any of the eligible vertices in$V_2 - S$. More about what ``eligible'' means later.\item $i > k \Land j \leq k$. This case will not occur due to the DFSnumbering of the vertices. There is an edge $(i,j)$ so $i$ must beless than $j$.\item $i > k \Land j > k$. Neither $i$ or $j$ is in the matchedsubgraph $G_1[k]$. This situation only happens at the very beginningof the search, or when $i$ and $j$ are not reachable from any of thevertices in $G_1[k]$. This means the smaller of $i$ and $j$ must bethe root of a DFS tree. We assign $r$ to any of the eligible verticesin $V_2 - S$, and then proceed as if we were in Case 2.\end{enumerate}@d Match function@{bool match(edge_iter iter){if (iter != ordered_edges.end()) {    ordered_edge edge = *iter;    size_type k_num = edge.k_num;    vertex1_t k = dfs_vertices[k_num];    vertex1_t u;    if (edge.source != -1) // might be a ficticious edge        u = dfs_vertices[edge.source];    vertex1_t v = dfs_vertices[edge.target];    if (edge.source == -1) { // root node        @<$v$ is a DFS tree root@>    } else if (f_assigned[v] == false) {        @<$v$ is an unmatched vertex, $(u,v)$ is a tree edge@>    } else {        @<Check to see if there is an edge in $G_2$ to match $(u,v)$@>    }} else     return true;return false;} // match()@}The basic idea will be to examine $G_1$ one edge at a time, trying tocreate a vertex mapping such that each edge matches one in $G_2$.  Weare going to consider the edges of $G_1$ in a specific order, so wewill label the edges $0,\ldots,|E_1|-1$.At each stage of the recursion westart with an isomorphism $f_{k-1}$ between $G_1[k-1]$ and a subgraphof $G_2$, which we denote by $G_2[S]$, so $G_1[k-1] \isomorphicG_2[S]$. The vertex set $S$ is the subset of $V_2$ that correspondsvia $f_{k-1}$ to the first $k-1$ vertices in $G_1$.We also order the edges of $G_1$We try to extend the isomorphism by finding a vertex $v \in V_2 - S$that matches with vertex $k$. If a matching vertex is found, we have anew isomorphism $f_k$ with $G_1[k] \isomorphic G_2[S \union \{ v \}]$.\begin{tabbing}IS\=O\=M\=O\=RPH($k$, $S$, $f_{k-1}$) $\equiv$ \\\>\textbf{if} ($k = |V_1|+1$) \\\>\>\textbf{return} true \\\>\textbf{for} each vertex $v \in V_2 - S$ \\\>\>\textbf{if} (MATCH($k$, $v$)) \\\>\>\>$f_k = f_{k-1} \union \pair{k}{v}$ \\\>\>\>ISOMORPH($k+1$, $S \union \{ v \}$, $f_k$)\\\>\>\textbf{else}\\\>\>\>\textbf{return} false \\\\ISOMORPH($0$, $G_1$, $\emptyset$, $G_2$)\end{tabbing}The basic idea of the match operation is to check whether $G_1[k]$ isisomorphic to $G_2[S \union \{ v \}]$. We already know that $G_1[k-1]\isomorphic G_2[S]$ with the mapping $f_{k-1}$, so all we need to dois verify that the edges in $E_1[k] - E_1[k-1]$ connect vertices thatcorrespond to the vertices connected by the edges in $E_2[S \union \{v \}] - E_2[S]$. The edges in $E_1[k] - E_1[k-1]$ are all theout-edges $(k,j)$ and in-edges $(j,k)$ of $k$ where $j$ is less thanor equal to $k$ according to the ordering.  The edges in $E_2[S \union\{ v \}] - E_2[S]$ consists of all the out-edges $(v,u)$ andin-edges $(u,v)$ of $v$ where $u \in S$.\begin{tabbing}M\=ATCH($k$, $v$) $\equiv$ \\\>$out_k \leftarrow \forall (k,j) \in E_1[k] - E_1[k-1] \Big( (v,f(j)) \in E_2[S \union \{ v \}] - E_2[S] \Big)$ \\\>$in_k \leftarrow \forall (j,k) \in E_1[k] - E_1[k-1] \Big( (f(j),v) \in E_2[S \union \{ v \}] - E_2[S] \Big)$ \\\>$out_v \leftarrow \forall (v,u) \in E_2[S \union \{ v \}] - E_2[S] \Big( (k,f^{-1}(u)) \in E_1[k] - E_1[k-1] \Big)$ \\\>$in_v \leftarrow \forall (u,v) \in E_2[S \union \{ v \}] - E_2[S] \Big( (f^{-1}(u),k) \in E_1[k] - E_1[k-1] \Big)$ \\\>\textbf{return} $out_k \Land in_k \Land out_v \Land in_v$ \end{tabbing}The problem with the exhaustive backtracking algorithm is that thereare $N!$ possible vertex mappings, and $N!$ gets very large as $N$increases, so we need to prune the search space. We use the pruningtechniques described in\cite{deo77:_new_algo_digraph_isomorph,fortin96:_isomorph,reingold77:_combin_algo}that originated in\cite{sussenguth65:_isomorphism,unger64:_isomorphism}.\section{Vertex Invariants}\label{sec:vertex-invariants}One way to reduce the search space is through the use of \emph{vertexinvariants}. The idea is to compute a number for each vertex $i(v)$such that $i(v) = i(v')$ if there exists some isomorphism $f$ where$f(v) = v'$. Then when we look for a match to some vertex $v$, we onlyneed to consider those vertices that have the same vertex invariantnumber. The number of vertices in a graph with the same vertexinvariant number $i$ is called the \emph{invariant multiplicity} for$i$.  In this implementation, by default we use the out-degree of thevertex as the vertex invariant, though the user can also supply thereown invariant function. The ability of the invariant function to prunethe search space varies widely with the type of graph.As a first check to rule out graphs that have no possibility ofmatching, one can create a list of computed vertex invariant numbersfor the vertices in each graph, sort the two lists, and then comparethem.  If the two lists are different then the two graphs are notisomorphic.  If the two lists are the same then the two graphs may beisomorphic.Also, we extend the MATCH operation to use the vertex invariants tohelp rule out vertices.\section{Vertex Order}A good choice of the labeling for the vertices (which determines theorder in which the subgraph $G_1[k]$ is grown) can also reduce thesearch space. In the following we discuss two labeling heuristics.\subsection{Most Constrained First}Consider the most constrained vertices first.  That is, examinelower-degree vertices before higher-degree vertices. This reduces thesearch space because it chops off a trunk before the trunk has achance to blossom out. We can generalize this to use vertexinvariants. We examine vertices with low invariant multiplicitybefore examining vertices with high invariant multiplicity.\subsection{Adjacent First}The MATCH operation only considers edges when the other vertex alreadyhas a mapping defined. This means that the MATCH operation can onlyweed out vertices that are adjacent to vertices that have already beenmatched. Therefore, when choosing the next vertex to examine, it isdesirable to choose one that is adjacent a vertex already in $S_1$.\subsection{DFS Order, Starting with Lowest Multiplicity}For this implementation, we combine the above two heuristics in thefollowing way. To implement the ``adjacent first'' heuristic we applyDFS to the graph, and use the DFS discovery order as our vertexorder. To comply with the ``most constrained first'' heuristic weorder the roots of our DFS trees by invariant multiplicity.\section{Implementation}The following is the public interface for the \code{isomorphism}function. The input to the function is the two graphs $G_1$ and $G_2$,mappings from the vertices in the graphs to integers (in the range$[0,|V|)$), and a vertex invariant function object. The output of thefunction is an isomorphism $f$ if there is one. The \code{isomorphism}function returns true if the graphs are isomorphic and falseotherwise. The invariant parameters are function objects that computethe vertex invariants for vertices of the two graphs.  The\code{max\_invariant} parameter is to specify one past the largestinteger that a vertex invariant number could be (the invariantsnumbers are assumed to span from zero to the number).  Therequirements on type template parameters are described below in the``Concept checking'' part.@d Isomorphism function interface@{template <typename Graph1, typename Graph2, typename IsoMapping,           typename Invariant1, typename Invariant2,          typename IndexMap1, typename IndexMap2>bool isomorphism(const Graph1& G1, const Graph2& G2, IsoMapping f,                  Invariant1 invariant1, Invariant2 invariant2,                  std::size_t max_invariant,                 IndexMap1 index_map1, IndexMap2 index_map2)@}The function body consists of the concept checks followed by a quickcheck for empty graphs or graphs of different size and then constructan algorithm object. We then call the \code{test\_isomorphism} memberfunction, which runs the algorithm.  The reason that we implement thealgorithm using a class is that there are a fair number of internaldata structures required, and it is easier to make these data membersof a class and make each section of the algorithm a memberfunction. This relieves us from the burden of passing lots ofarguments to each function, while at the same time avoiding the evilsof global variables (non-reentrant, etc.).@d Isomorphism function body@{{    @<Concept checking@>    @<Quick return based on size@>    detail::isomorphism_algo<Graph1, Graph2, IsoMapping, Invariant1, Invariant2,         IndexMap1, IndexMap2>         algo(G1, G2, f, invariant1, invariant2, max_invariant, 

⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?