📄 isomorphism-impl-v3.w

📁 C++的一个好库。。。现在很流行
💻 W
📖 第 1 页 / 共 3 页
字号:
12 3 下一页
\documentclass[11pt,awpaper]{book}

\usepackage{math}
\usepackage{jweb}
\usepackage[nolineno]{lgrind}
\usepackage{awpaper}
\usepackage{graphicx}

\setlength{\evensidemargin}{-0.25in}% was -0.17in
\setlength{\oddsidemargin}{0in}%was -0.25in
\setlength{\textwidth}{5.625in}%was 5.75in
\setlength{\textheight}{7.75in}
\setlength{\topmargin}{-0.125in}%was -0.25
\addtolength{\topmargin}{-\headheight}
\addtolength{\topmargin}{-\headsep}

\newif\ifpdf
\ifx\pdfoutput\undefined
   \pdffalse
\else
   \pdfoutput=1
   \pdftrue
\fi

\ifpdf
  \usepackage[
              pdftex,
              colorlinks=true, %change to true for the electronic version
              linkcolor=blue,filecolor=blue,pagecolor=blue,urlcolor=blue
              ]{hyperref}
\fi

\ifpdf
  \newcommand{\stlconcept}[1]{\href{http://www.sgi.com/tech/stl/#1.html}{{\small \textsf{#1}}}}
  \newcommand{\bglconcept}[1]{\href{http://www.boost.org/libs/graph/doc/#1.html}{{\small \textsf{#1}}}}
  \newcommand{\pmconcept}[1]{\href{http://www.boost.org/libs/property_map/#1.html}{{\small \textsf{#1}}}}
  \newcommand{\myhyperref}[2]{\hyperref[#1]{#2}}
  \newcommand{\vizfig}[2]{\begin{figure}[htbp]\centerline{\includegraphics*{#1.pdf}}\caption{#2}\label{fig:#1}\end{figure}}
\else
  \newcommand{\myhyperref}[2]{#2}
  \newcommand{\bglconcept}[1]{{\small \textsf{#1}}}
  \newcommand{\pmconcept}[1]{{\small \textsf{#1}}}
  \newcommand{\stlconcept}[1]{{\small \textsf{#1}}}
  \newcommand{\vizfig}[2]{\begin{figure}[htbp]\centerline{\includegraphics*{#1.eps}}\caption{#2}\label{fig:#1}\end{figure}}
\fi

\newcommand{\code}[1]{{\small{\em \textbf{#1}}}}


\newcommand{\isomorphic}{\cong}

\begin{document}

\title{An Implementation of Graph Isomorphism Testing}
\author{Jeremy G. Siek}

\maketitle

% Ideas: use BFS instead of DFS, don't have to sort edges?
% No, you would still have to sort the edges.
%
%Figure~\ref{fig:iso-eg2}.
% 0 0 0 1 1 2 5 6 6 7
% 1 2 3 4 2 4 6 3 7 5
%\vizfig{iso-eg2}{Vertices numbered by BFS discover time. The BFS tree
%edges are the solid lines. Nodes $0$ and $5$ are BFS tree root nodes.}
%
% You could do a modified Dijkstra, where the priority in the queue
% would be the BFS discover time of the target vertex.

% Use w(u,v) = |Adj[u] \intersect Adj[v]| as an edge invariant.
% Has anyone used edge invariants before?


\section{Introduction}

This paper documents the implementation of the \code{isomorphism()}
function of the Boost Graph Library.  The implementation was by Jeremy
Siek with algorithmic improvements and test code from Douglas Gregor
and Brian Osman.  The \code{isomorphism()} function answers the
question, ``are these two graphs equal?''  By \emph{equal} we mean
the two graphs have the same structure---the vertices and edges are
connected in the same way. The mathematical name for this kind of
equality is \emph{isomorphism}.

More precisely, an \emph{isomorphism} is a one-to-one mapping of the
vertices in one graph to the vertices of another graph such that
adjacency is preserved. Another words, given graphs $G_{1} =
(V_{1},E_{1})$ and $G_{2} = (V_{2},E_{2})$, an isomorphism is a
function $f$ such that for all pairs of vertices $a,b$ in $V_{1}$,
edge $(a,b)$ is in $E_{1}$ if and only if edge $(f(a),f(b))$ is in
$E_{2}$.

The graph $G_1$ is \emph{isomorphic} to $G_2$ if an isomorphism exists
between the two graphs, which we denote by $G_1 \isomorphic G_2$.
Both graphs must be the same size, so let $N = |V_1| = |V_2|$.

In the following discussion we will need to use several more notions
from graph theory. The graph $G_s=(V_s,E_s)$ is a \emph{subgraph} of
graph $G=(V,E)$ if $V_s \subseteq V$ and $E_s \subseteq E$.  An
\emph{induced subgraph}, denoted by $G[V_s]$, of a graph $G=(V,E)$
consists of the vertices in $V_s$, which is a subset of $V$, and every
edge $(u,v)$ in $E$ such that both $u$ and $v$ are in $V_s$.  We use
the notation $E[V_s]$ to mean the edges in $G[V_s]$.

\section{Backtracking Search}
\label{sec:backtracking}

The algorithm used by the \code{isomorphism()} function is, at first
approximation, an exhaustive search implemented via backtracking.  The
backtracking algorithm is a recursive function. At each stage we will
try to extend the match that we have found so far.  So suppose that we
have already determined that some subgraph of $G_1$ is isomorphic to a
subgraph of $G_2$.  We then try to add a vertex to each subgraph such
that the new subgraphs are still isomorphic to one another. At some
point we may hit a dead end---there are no vertices that can be added
to extend the isomorphic subgraphs. We then backtrack to previous
smaller matching subgraphs, and try extending with a different vertex
choice. The process ends by either finding a complete mapping between
$G_1$ and $G_2$ and returning true, or by exhausting all possibilities
and returning false.

The problem with the exhaustive backtracking algorithm is that there
are $N!$ possible vertex mappings, and $N!$ gets very large as $N$
increases, so we need to prune the search space. We use the pruning
techniques described in
\cite{deo77:_new_algo_digraph_isomorph,fortin96:_isomorph,reingold77:_combin_algo},
some of which originated in
\cite{sussenguth65:_isomorphism,unger64:_isomorphism}. Also, the
specific backtracking method we use is the one from
\cite{deo77:_new_algo_digraph_isomorph}.

We consider the vertices of $G_1$ for addition to the matched subgraph
in a specific order, so assume that the vertices of $G_1$ are labeled
$1,\ldots,N$ according to that order. As we will see later, a good
ordering of the vertices is by DFS discover time.  Let $G_1[k]$ denote
the subgraph of $G_1$ induced by the first $k$ vertices, with $G_1[0]$
being an empty graph. We also consider the edges of $G_1$ in a
specific order. We always examine edges in the current subgraph
$G_1[k]$ first, that is, edges $(u,v)$ where both $u \leq k$ and $v
\leq k$. This ordering of edges can be acheived by sorting each edge
$(u,v)$ by lexicographical comparison on the tuple $\langle \max(u,v),
u, v \rangle$. Figure~\ref{fig:iso-eg} shows an example of a graph
with the vertices labelled by DFS discover time. The edge ordering for
this graph is as follows:

\begin{tabular}{lccccccccc}
source: &0&1&0&1&3&0&5&6&6\\
target: &1&2&3&3&2&4&6&4&7
\end{tabular}

\vizfig{iso-eg}{Vertices numbered by DFS discover time. The DFS tree
edges are the solid lines. Nodes $0$ and $5$ are DFS tree root nodes.}

Each step of the backtracking search moves from left to right though
the ordered edges. At each step it examines an edge $(i,j)$ of $G_1$
and decides whether to continue to the left or to go back. There are
three cases to consider:

\begin{enumerate}
\item \label{case:1} $i > k$
\item \label{case:2} $i \leq k$ and $j > k$.
\item \label{case:3} $i \leq k$ and $j \leq k$.
\end{enumerate}

\paragraph{Case 1: $i > k$.}
$i$ is not in the matched subgraph $G_1[k]$. This situation only
happens at the very beginning of the search, or when $i$ is not
reachable from any of the vertices in $G_1[k]$.  This means that we
are finished with $G_1[k]$.  We increment $k$ and find a match for it
amongst any of the eligible vertices in $V_2 - S$. We then proceed to
Case 2. It is usually the case that $i$ is equal to the new $k$, but
when there is another DFS root $r$ with no in-edges or out-edges
and if $r < i$ then it will be the new $k$.

\paragraph{Case 2: $i \leq k$ and $j > k$.}
$i$ is in the matched subgraph $G_1[k]$, but $j$ is not. We are about
to increment $k$ to try and grow the matched subgraph to include
$j$. However, first we need to finish verifying that $G_1[k]
\isomorphic G_2[S]$. In previous steps we proved that $G_1[k-1]
\isomorphic G_2[S-\{f(k)\}]$, so now we just need to verify the
extension of the isomorphism to $k$. At this point we are guaranteed
to have seen all the edges to and from vertex $k$ (because the edges
are sorted), and in previous steps we have checked that for each edge
incident on $k$ in $E_1[k]$ there is a matching edge in
$E_2[S]$. However we still need to check the ``only if'' part of the
``if and only if''. So we check that for every edge $(u,v)$ incident
on $f(k)$ there is $(f^{-1}(u),f^{-1}(v)) \in E_1[k]$.  A quick way to
verify this is to make sure that the number of edges incident on $k$
in $E_1[k]$ is the same as the number of edges incident on $f(k)$ in
$E_2[S]$. We create an edge counter that we increment every time we
see an edge incident on $k$ and decrement for each edge incident on
$f(k)$. If the counter gets back to zero we know the edges match up.

Once we have verified that $G_1[k] \isomorphic G_2[S]$ we add $f(k)$
to $S$, increment $k$, and then try assigning $j$ to
any of the eligible vertices in $V_2 - S$. More about what
``eligible'' means below.

\paragraph{Case 3: $i \leq k$ and $j \leq k$.}
Both $i$ and $j$ are in $G_1[k]$.  We check to make sure that
$(f(i),f(j)) \in E_2[S]$ and then proceed to the next edge.


\subsection{Vertex Invariants}
\label{sec:vertex-invariants}

One way to reduce the search space is through the use of \emph{vertex
invariants}. The idea is to compute a number for each vertex $i(v)$
such that $i(v) = i(v')$ if there exists some isomorphism $f$ where
$f(v) = v'$. Then when we look for a match to some vertex $v$, only
those vertices that have the same vertex invariant number are
``eligible''. The number of vertices in a graph with the same vertex
invariant number $i$ is called the \emph{invariant multiplicity} for
$i$.  In this implementation, by default we use the function $i(v) =
(|V|+1) \times \outdegree(v) + \indegree(v)$, though the user can also
supply there own invariant function. The ability of the invariant
function to prune the search space varies widely with the type of
graph.

The following is the definition of the functor that implements the
default vertex invariant. The functor models the
\stlconcept{AdaptableUnaryFunction} concept.

@d Degree vertex invariant functor
@{
template <typename InDegreeMap, typename Graph>
class degree_vertex_invariant
{
    typedef typename graph_traits<Graph>::vertex_descriptor vertex_t;
    typedef typename graph_traits<Graph>::degree_size_type size_type;
public:
    typedef vertex_t argument_type;
    typedef size_type result_type;

    degree_vertex_invariant(const InDegreeMap& in_degree_map, const Graph& g)
        : m_in_degree_map(in_degree_map), m_g(g) { }

    size_type operator()(vertex_t v) const {
        return (num_vertices(m_g) + 1) * out_degree(v, m_g)
            + get(m_in_degree_map, v);
    }
    // The largest possible vertex invariant number
    size_type max() const { 
        return num_vertices(m_g) * num_vertices(m_g) + num_vertices(m_g);
    }
private:
    InDegreeMap m_in_degree_map;
    const Graph& m_g;
};
@}


\subsection{Vertex Order}

A good choice of the labeling for the vertices (which determines the
order in which the subgraph $G_1[k]$ is grown) can also reduce the
search space. In the following we discuss two labeling heuristics.

\subsubsection{Most Constrained First}

Consider the most constrained vertices first.  That is, examine
lower-degree vertices before higher-degree vertices. This reduces the
search space because it chops off a trunk before the trunk has a
chance to blossom out. We can generalize this to use vertex
invariants. We examine vertices with low invariant multiplicity
before examining vertices with high invariant multiplicity.

\subsubsection{Adjacent First}

It only makes sense to examine an edge if one or more of its vertices
has been assigned a mapping. This means that we should visit vertices
adjacent to those in the current matched subgraph before proceeding.

\subsubsection{DFS Order, Starting with Lowest Multiplicity}

For this implementation, we combine the above two heuristics in the
following way. To implement the ``adjacent first'' heuristic we apply
DFS to the graph, and use the DFS discovery order as our vertex
order. To comply with the ``most constrained first'' heuristic we
order the roots of our DFS trees by invariant multiplicity.


\subsection{Implementation of the \code{match} function}


The \code{match} function implements the recursive backtracking,
handling the four cases described in \S\ref{sec:backtracking}.

@d Match function
@{
bool match(edge_iter iter, int dfs_num_k)
{
    if (iter != ordered_edges.end()) {
        vertex1_t i = source(*iter, G1), j = target(*iter, G2);
        if (dfs_num[i] > dfs_num_k) {
           @<Find a match for the DFS tree root $k+1$@>
        }
        else if (dfs_num[j] > dfs_num_k) {
            @<Verify $G_1[k] \isomorphic G_2[S]$ and then find match for $j$@>
        }
        else {
            @<Check to see if $(f(i),f(j)) \in E_2[S]$ and continue@>
        }
    } else 
        return true;
    return false;
}
@}

\noindent Now to describe how each of the four cases is implemented.

\paragraph{Case 1: $i \not\in G_1[k]$.}
We increment $k$ and try to map it to any of the eligible vertices of
$V_2 - S$.  After matching the new $k$ we proceed by invoking
\code{match}. We do not yet move on to the next edge, since we have
not yet found a match for edge, or for target $j$.  We reset the edge
counter to zero.

@d Find a match for the DFS tree root $k+1$
@{
vertex1_t kp1 = dfs_vertices[dfs_num_k + 1];
BGL_FORALL_VERTICES_T(u, G2, Graph2) {
    if (invariant1(kp1) == invariant2(u) && in_S[u] == false) {
        f[kp1] = u;
        in_S[u] = true;
        num_edges_on_k = 0;
        if (match(iter, dfs_num_k + 1));
            return true;
        in_S[u] = false;
12 3 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -