📄 isomorphism-impl-v2.w

📁 C++的一个好库。。。现在很流行
💻 W
📖 第 1 页 / 共 3 页
字号:
12 3 下一页
\documentclass[11pt]{report}

%\input{defs}
\usepackage{math}
\usepackage{jweb}
\usepackage{lgrind}
\usepackage{times}
\usepackage{fullpage}
\usepackage{graphicx}

\newif\ifpdf
\ifx\pdfoutput\undefined
   \pdffalse
\else
   \pdfoutput=1
   \pdftrue
\fi

\ifpdf
  \usepackage[
              pdftex,
              colorlinks=true, %change to true for the electronic version
              linkcolor=blue,filecolor=blue,pagecolor=blue,urlcolor=blue
              ]{hyperref}
\fi

\ifpdf
  \newcommand{\stlconcept}[1]{\href{http://www.sgi.com/tech/stl/#1.html}{{\small \textsf{#1}}}}
  \newcommand{\bglconcept}[1]{\href{http://www.boost.org/libs/graph/doc/#1.html}{{\small \textsf{#1}}}}
  \newcommand{\pmconcept}[1]{\href{http://www.boost.org/libs/property_map/#1.html}{{\small \textsf{#1}}}}
  \newcommand{\myhyperref}[2]{\hyperref[#1]{#2}}
  \newcommand{\vizfig}[2]{\begin{figure}[htbp]\centerline{\includegraphics*{#1.pdf}}\caption{#2}\label{fig:#1}\end{figure}}
\else
  \newcommand{\myhyperref}[2]{#2}
  \newcommand{\bglconcept}[1]{{\small \textsf{#1}}}
  \newcommand{\pmconcept}[1]{{\small \textsf{#1}}}
  \newcommand{\stlconcept}[1]{{\small \textsf{#1}}}
  \newcommand{\vizfig}[2]{\begin{figure}[htbp]\centerline{\includegraphics*{#1.eps}}\caption{#2}\label{fig:#1}\end{figure}}
\fi

\newcommand{\code}[1]{{\small{\em \textbf{#1}}}}


\newcommand{\isomorphic}{\cong}

\begin{document}

\title{An Implementation of Isomorphism Testing}
\author{Jeremy G. Siek}

\maketitle

\section{Introduction}

This paper documents the implementation of the \code{isomorphism()}
function of the Boost Graph Library.  The implementation was by Jeremy
Siek with algorithmic improvements and test code from Douglas Gregor
and Brian Osman.  The \code{isomorphism()} function answers the
question, ``are these two graphs equal?''  By \emph{equal} we mean
the two graphs have the same structure---the vertices and edges are
connected in the same way. The mathematical name for this kind of
equality is \emph{isomorphism}.

More precisely, an \emph{isomorphism} is a one-to-one mapping of the
vertices in one graph to the vertices of another graph such that
adjacency is preserved. Another words, given graphs $G_{1} =
(V_{1},E_{1})$ and $G_{2} = (V_{2},E_{2})$, an isomorphism is a
function $f$ such that for all pairs of vertices $a,b$ in $V_{1}$,
edge $(a,b)$ is in $E_{1}$ if and only if edge $(f(a),f(b))$ is in
$E_{2}$.

Both graphs must be the same size, so let $N = |V_1| = |V_2|$. The
graph $G_1$ is \emph{isomorphic} to $G_2$ if an isomorphism exists
between the two graphs, which we denote by $G_1 \isomorphic G_2$.

In the following discussion we will need to use several notions from
graph theory. The graph $G_s=(V_s,E_s)$ is a \emph{subgraph} of graph
$G=(V,E)$ if $V_s \subseteq V$ and $E_s \subseteq E$.  An
\emph{induced subgraph}, denoted by $G[V_s]$, of a graph $G=(V,E)$
consists of the vertices in $V_s$, which is a subset of $V$, and every
edge $(u,v)$ in $E$ such that both $u$ and $v$ are in $V_s$.  We use
the notation $E[V_s]$ to mean the edges in $G[V_s]$.

In some places we express a function as a set of pairs, so the set $f
= \{ \pair{a_1}{b_1}, \ldots, \pair{a_n}{b_n} \}$
means $f(a_i) = b_i$ for $i=1,\ldots,n$.

\section{Exhaustive Backtracking Search}
\label{sec:backtracking}

The algorithm used by the \code{isomorphism()} function is, at
first approximation, an exhaustive search implemented via
backtracking.  The backtracking algorithm is a recursive function. At
each stage we will try to extend the match that we have found so far.
So suppose that we have already determined that some subgraph of $G_1$
is isomorphic to a subgraph of $G_2$.  We then try to add a vertex to
each subgraph such that the new subgraphs are still isomorphic to one
another. At some point we may hit a dead end---there are no vertices
that can be added to extend the isomorphic subgraphs. We then
backtrack to previous smaller matching subgraphs, and try extending
with a different vertex choice. The process ends by either finding a
complete mapping between $G_1$ and $G_2$ and return true, or by
exhausting all possibilities and returning false.

We consider the vertices of $G_1$ for addition to the matched subgraph
in a specific order, so assume that the vertices of $G_1$ are labeled
$1,\ldots,N$ according to that order. As we will see later, a good
ordering of the vertices is by DFS discover time.  Let $G_1[k]$ denote
the subgraph of $G_1$ induced by the first $k$ vertices, with $G_1[0]$
being an empty graph. We also consider the edges of $G_1$ in a
specific order. We always examine edges in the current subgraph
$G_1[k]$ first, that is, edges $(u,v)$ where both $u \leq k$ and $v
\leq k$. This ordering of edges can be acheived by sorting the edges
according to number of the larger of the source and target vertex.

Each step of the backtracking search examines an edge $(u,v)$ of $G_1$
and decides whether to continue or go back. There are three cases to
consider:

\begin{enumerate}

\item $i \leq k \Land j \leq k$. Both $i$ and $j$ are in $G_1[k]$.  We
check to make sure the $(f(i),f(j)) \in E_2[S]$.

\item $i \leq k \Land j > k$. $i$ is in the matched subgraph $G_1[k]$,
but $j$ is not. We are about to increment $k$ try to grow the matched
subgraph to include $j$. However, first we need to finalize our check
of the isomorphism between subgraphs $G_1[k]$ and $G_2[S]$.  At this
point we are guaranteed to have seen all the edges to and from vertex $k$
(because the edges are sorted), and in previous steps we have checked
that for each edge incident on $k$ in $G_1[k]$ there is a matching
edge in $G_2[S]$.  However we have not checked that for each edge
incident on $f(k)$ in $E_2[S]$, there is a matching edge in $E_1[k]$
(we need to check the ``only if'' part of the ``if and only if'').
Therefore we scan through all the edges $(u,v)$ incident on $f(k)$ and
make sure that $(f^{-1}(u),f^{-1}(v)) \in E_1[k]$. Once this check has
been performed, we add $f(k)$ to $S$, we increment $k$ (so now $k=j$),
and then try assigning the new $k$ to any of the eligible vertices in
$V_2 - S$. More about what ``eligible'' means later.

\item $i > k \Land j \leq k$. This case will not occur due to the DFS
numbering of the vertices. There is an edge $(i,j)$ so $i$ must be
less than $j$.

\item $i > k \Land j > k$. Neither $i$ or $j$ is in the matched
subgraph $G_1[k]$. This situation only happens at the very beginning
of the search, or when $i$ and $j$ are not reachable from any of the
vertices in $G_1[k]$. This means the smaller of $i$ and $j$ must be
the root of a DFS tree. We assign $r$ to any of the eligible vertices
in $V_2 - S$, and then proceed as if we were in Case 2.

\end{enumerate}



@d Match function
@{
bool match(edge_iter iter)
{
if (iter != ordered_edges.end()) {
    ordered_edge edge = *iter;
    size_type k_num = edge.k_num;
    vertex1_t k = dfs_vertices[k_num];
    vertex1_t u;
    if (edge.source != -1) // might be a ficticious edge
        u = dfs_vertices[edge.source];
    vertex1_t v = dfs_vertices[edge.target];
    if (edge.source == -1) { // root node
        @<$v$ is a DFS tree root@>
    } else if (f_assigned[v] == false) {
        @<$v$ is an unmatched vertex, $(u,v)$ is a tree edge@>
    } else {
        @<Check to see if there is an edge in $G_2$ to match $(u,v)$@>
    }
} else 
    return true;
return false;
} // match()
@}






The basic idea will be to examine $G_1$ one edge at a time, trying to
create a vertex mapping such that each edge matches one in $G_2$.  We
are going to consider the edges of $G_1$ in a specific order, so we
will label the edges $0,\ldots,|E_1|-1$.

At each stage of the recursion we
start with an isomorphism $f_{k-1}$ between $G_1[k-1]$ and a subgraph
of $G_2$, which we denote by $G_2[S]$, so $G_1[k-1] \isomorphic
G_2[S]$. The vertex set $S$ is the subset of $V_2$ that corresponds
via $f_{k-1}$ to the first $k-1$ vertices in $G_1$.

We also order the edges of $G_1$



We try to extend the isomorphism by finding a vertex $v \in V_2 - S$
that matches with vertex $k$. If a matching vertex is found, we have a
new isomorphism $f_k$ with $G_1[k] \isomorphic G_2[S \union \{ v \}]$.




\begin{tabbing}
IS\=O\=M\=O\=RPH($k$, $S$, $f_{k-1}$) $\equiv$ \\
\>\textbf{if} ($k = |V_1|+1$) \\
\>\>\textbf{return} true \\
\>\textbf{for} each vertex $v \in V_2 - S$ \\
\>\>\textbf{if} (MATCH($k$, $v$)) \\
\>\>\>$f_k = f_{k-1} \union \pair{k}{v}$ \\
\>\>\>ISOMORPH($k+1$, $S \union \{ v \}$, $f_k$)\\
\>\>\textbf{else}\\
\>\>\>\textbf{return} false \\
\\
ISOMORPH($0$, $G_1$, $\emptyset$, $G_2$)
\end{tabbing}

The basic idea of the match operation is to check whether $G_1[k]$ is
isomorphic to $G_2[S \union \{ v \}]$. We already know that $G_1[k-1]
\isomorphic G_2[S]$ with the mapping $f_{k-1}$, so all we need to do
is verify that the edges in $E_1[k] - E_1[k-1]$ connect vertices that
correspond to the vertices connected by the edges in $E_2[S \union \{
v \}] - E_2[S]$. The edges in $E_1[k] - E_1[k-1]$ are all the
out-edges $(k,j)$ and in-edges $(j,k)$ of $k$ where $j$ is less than
or equal to $k$ according to the ordering.  The edges in $E_2[S \union
\{ v \}] - E_2[S]$ consists of all the out-edges $(v,u)$ and
in-edges $(u,v)$ of $v$ where $u \in S$.

\begin{tabbing}
M\=ATCH($k$, $v$) $\equiv$ \\
\>$out_k \leftarrow \forall (k,j) \in E_1[k] - E_1[k-1] \Big( (v,f(j)) \in E_2[S \union \{ v \}] - E_2[S] \Big)$ \\
\>$in_k \leftarrow \forall (j,k) \in E_1[k] - E_1[k-1] \Big( (f(j),v) \in E_2[S \union \{ v \}] - E_2[S] \Big)$ \\
\>$out_v \leftarrow \forall (v,u) \in E_2[S \union \{ v \}] - E_2[S] \Big( (k,f^{-1}(u)) \in E_1[k] - E_1[k-1] \Big)$ \\
\>$in_v \leftarrow \forall (u,v) \in E_2[S \union \{ v \}] - E_2[S] \Big( (f^{-1}(u),k) \in E_1[k] - E_1[k-1] \Big)$ \\
\>\textbf{return} $out_k \Land in_k \Land out_v \Land in_v$ 
\end{tabbing}

The problem with the exhaustive backtracking algorithm is that there
are $N!$ possible vertex mappings, and $N!$ gets very large as $N$
increases, so we need to prune the search space. We use the pruning
techniques described in
\cite{deo77:_new_algo_digraph_isomorph,fortin96:_isomorph,reingold77:_combin_algo}
that originated in
\cite{sussenguth65:_isomorphism,unger64:_isomorphism}.

\section{Vertex Invariants}
\label{sec:vertex-invariants}

One way to reduce the search space is through the use of \emph{vertex
invariants}. The idea is to compute a number for each vertex $i(v)$
such that $i(v) = i(v')$ if there exists some isomorphism $f$ where
$f(v) = v'$. Then when we look for a match to some vertex $v$, we only
need to consider those vertices that have the same vertex invariant
number. The number of vertices in a graph with the same vertex
invariant number $i$ is called the \emph{invariant multiplicity} for
$i$.  In this implementation, by default we use the out-degree of the
vertex as the vertex invariant, though the user can also supply there
own invariant function. The ability of the invariant function to prune
the search space varies widely with the type of graph.

As a first check to rule out graphs that have no possibility of
matching, one can create a list of computed vertex invariant numbers
for the vertices in each graph, sort the two lists, and then compare
them.  If the two lists are different then the two graphs are not
isomorphic.  If the two lists are the same then the two graphs may be
isomorphic.

Also, we extend the MATCH operation to use the vertex invariants to
help rule out vertices.

\section{Vertex Order}

A good choice of the labeling for the vertices (which determines the
order in which the subgraph $G_1[k]$ is grown) can also reduce the
search space. In the following we discuss two labeling heuristics.

\subsection{Most Constrained First}

Consider the most constrained vertices first.  That is, examine
lower-degree vertices before higher-degree vertices. This reduces the
search space because it chops off a trunk before the trunk has a
chance to blossom out. We can generalize this to use vertex
invariants. We examine vertices with low invariant multiplicity
before examining vertices with high invariant multiplicity.

\subsection{Adjacent First}

The MATCH operation only considers edges when the other vertex already
has a mapping defined. This means that the MATCH operation can only
weed out vertices that are adjacent to vertices that have already been
matched. Therefore, when choosing the next vertex to examine, it is
desirable to choose one that is adjacent a vertex already in $S_1$.

\subsection{DFS Order, Starting with Lowest Multiplicity}

For this implementation, we combine the above two heuristics in the
following way. To implement the ``adjacent first'' heuristic we apply
DFS to the graph, and use the DFS discovery order as our vertex
order. To comply with the ``most constrained first'' heuristic we
order the roots of our DFS trees by invariant multiplicity.



\section{Implementation}

The following is the public interface for the \code{isomorphism}
function. The input to the function is the two graphs $G_1$ and $G_2$,
mappings from the vertices in the graphs to integers (in the range
$[0,|V|)$), and a vertex invariant function object. The output of the
function is an isomorphism $f$ if there is one. The \code{isomorphism}
function returns true if the graphs are isomorphic and false
otherwise. The invariant parameters are function objects that compute
the vertex invariants for vertices of the two graphs.  The
\code{max\_invariant} parameter is to specify one past the largest
integer that a vertex invariant number could be (the invariants
numbers are assumed to span from zero to the number).  The
requirements on type template parameters are described below in the
``Concept checking'' part.


@d Isomorphism function interface
@{
template <typename Graph1, typename Graph2, typename IsoMapping, 
          typename Invariant1, typename Invariant2,
          typename IndexMap1, typename IndexMap2>
bool isomorphism(const Graph1& G1, const Graph2& G2, IsoMapping f, 
                 Invariant1 invariant1, Invariant2 invariant2, 
                 std::size_t max_invariant,
                 IndexMap1 index_map1, IndexMap2 index_map2)
@}


The function body consists of the concept checks followed by a quick
check for empty graphs or graphs of different size and then construct
an algorithm object. We then call the \code{test\_isomorphism} member
function, which runs the algorithm.  The reason that we implement the
algorithm using a class is that there are a fair number of internal
data structures required, and it is easier to make these data members
of a class and make each section of the algorithm a member
function. This relieves us from the burden of passing lots of
arguments to each function, while at the same time avoiding the evils
of global variables (non-reentrant, etc.).


@d Isomorphism function body
@{
{
    @<Concept checking@>
    @<Quick return based on size@>
    detail::isomorphism_algo<Graph1, Graph2, IsoMapping, Invariant1, Invariant2, 
        IndexMap1, IndexMap2> 
        algo(G1, G2, f, invariant1, invariant2, max_invariant,
12 3 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -