% file:        sift.tex
% author:      Andrea Vedaldi
% description: SIFT code manual
% AUTORIGHTS
\documentclass{article}
\usepackage{visionlab,xspace,tabularx}
\usepackage[margin=2cm]{geometry}
\usepackage{graphics}
\usepackage[usenames]{color}
%\usepackage{showkeys}
\usepackage{hyperref}

\newcommand{\x}{\mathbf{x}}

\title{An implementation of SIFT detector and descriptor}
\author{Andrea Vedaldi\\ University of California -- VisionLab}
\date{}

\definecolor{codecolor}{rgb}{0.1,0.8,0.0}
\newcommand{\Matlab}{{\sc Matlab}\xspace}
\let\oldtt=\tt
\renewcommand{\tt}{\oldtt\color{codecolor}}

\begin{document}
\maketitle{}

% this will add all LaTeX labels as PDF destinations
% Used in combination with Hyperref, needs to be after \begin{document}
\let\oldlabel=\label
\renewcommand{\label}[1]{%
{\pdfdest name {#1} fitbh}% xyz
\oldlabel{#1}%
}

\tableofcontents{}

% ------------------------------------------------------------------------------
\section{Introduction}\label{sift.introduction}
% ------------------------------------------------------------------------------

These notes describe an implementation of the Scale-Invariant Feature Transform (SIFT) interest point detector and descriptor \cite{lowe04distinctive}. This implementation is designed to produce results close to Lowe's original implementation.\footnote{See {\tt http://www.cs.ubc.ca/\~{}lowe/keypoints/}} The SIFT detector and descriptor are discussed in some depth in the paper \cite{lowe04distinctive}.
Here we describe the interface to our implementation and, in the appendix, some technical details.

% ------------------------------------------------------------------------------
\section{User reference: the {\tt sift} function}\label{sift.user}
% ------------------------------------------------------------------------------

The SIFT detector and the SIFT descriptor are invoked by means of the function {\tt sift}, which provides a unified interface to both.

\begin{example}[Invocation]
The following lines run the SIFT detector and descriptor on the image {\tt data/test.png}.
\begin{verbatim}
  I = imread('data/test.png') ;
  % convert to double before scaling, so that the division is not
  % carried out (and rounded) in integer arithmetic
  I = double(rgb2gray(I))/256 ;
  [frames,descriptors] = sift(I, 'Verbosity', 1) ;
\end{verbatim}
The option-value pair \verb$'Verbosity',1$ causes the function to print a detailed progress report.
\end{example}

The {\tt sift} function returns a $4\times K$ matrix {\tt frames} containing the SIFT frames and a $128 \times K$ matrix {\tt descriptors} containing their descriptors. Each frame is characterized by four numbers which are, in order, $(x_1,x_2)$ for the center of the frame, $\sigma$ for its scale and $\theta$ for its orientation. The coordinates $(x_1,x_2)$ are relative to the upper-left corner of the image, which is assigned coordinates $(0,0)$, and may be fractional numbers (sub-pixel precision). The scale $\sigma$ is the smoothing level at which the frame has been detected. This number can also be interpreted as the size of the frame, which is usually visualized as a disk of radius $6\sigma$. Each descriptor is a vector coarsely describing the appearance of the image patch corresponding to the frame (further details are discussed in Appendix~\ref{sift.internals.descriptor}). Typically this vector has dimension 128, but this number can be changed by the user as described later.

Once the frames and descriptors of two images $I_1$ and $I_2$ have been computed, {\tt siftmatch} can be used to estimate the pairs of matching features.
This function uses Lowe's method to discard ambiguous matches~\cite{lowe04distinctive}. The result is a $2\times M$ matrix, each column of which is a pair $(k_1,k_2)$ of indices of corresponding SIFT frames.

\begin{example}[Matching]
Let us assume that the images {\tt I1} and {\tt I2} have been loaded and processed as in the previous example. The code
\begin{verbatim}
  matches = siftmatch(descriptors1, descriptors2) ;
\end{verbatim}
stores in {\tt matches} the matching pairs, one per column.
\end{example}

The package provides some ancillary functions; you can
\begin{itemize}
\item use {\tt plotsiftframe} to plot SIFT frames;
\item use {\tt plotsiftdescriptor} to plot SIFT descriptors;
\item use {\tt plotmatches} to plot feature matches;
\item use {\tt siftread} to read files produced by Lowe's implementation.
\end{itemize}

\begin{example}[Visualization]
Let {\tt I1}, {\tt I2} and {\tt matches} be as in the previous example. To visualize the matches, issue
\begin{verbatim}
  plotmatches(I1,I2,frames1,frames2,matches)
\end{verbatim}
\end{example}

The {\tt sift} function has many parameters. The default values have been chosen to emulate Lowe's original implementation. Although our code does not produce frames and descriptors that are identical to those of the original, in general they are quite similar.

% ------------------------------------------------------------------------------
\subsection{Scale space parameters}\label{sift.user.ss}
% ------------------------------------------------------------------------------

The SIFT detector and descriptor are constructed from the {\em Gaussian scale space} of the source image $I(x)$. The Gaussian scale space is the function
\[
   G(x;\sigma) \defeq (g_\sigma*I)(x)
\]
where $g_\sigma$ is an isotropic Gaussian kernel of variance $\sigma^2 I$, $x$ is the spatial coordinate and $\sigma$ is the scale coordinate.
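For concreteness, and assuming the standard normalized two-dimensional Gaussian (these notes do not spell the kernel out, so this form is an assumption consistent with the stated isotropic variance), the kernel and the resulting scale space can be written as
\[
  g_\sigma(x) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{\|x\|^2}{2\sigma^2}\right),
  \qquad
  G(x;\sigma) = \int g_\sigma(x-y)\, I(y)\, dy .
\]
With this kernel, successive smoothings compose by adding variances, $g_{\sigma_a} * g_{\sigma_b} = g_{\sqrt{\sigma_a^2+\sigma_b^2}}$, so that a level of the scale space can be obtained by further smoothing any finer level.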
The algorithm makes use of another scale space too, called the {\em difference of Gaussians (DOG)}, which is, coarsely speaking, the scale derivative of the Gaussian scale space.

Since the scale space $G(x;\sigma)$ represents the same information (the image $I(x)$) at different levels of scale, it is sampled in a particular way to reduce redundancy. The domain of the variable $\sigma$ is discretized in logarithmic steps arranged in $O$ octaves. Each octave is further subdivided into $S$ sub-levels. The distinction between octave and sub-level is important because at each successive octave the data is spatially downsampled by a factor of two. Octaves and sub-levels are identified by a discrete {\em octave index} $o$ and {\em sub-level index} $s$ respectively. The octave index $o$ and the sub-level index $s$ are mapped to the corresponding scale $\sigma$ by the formula
\be\label{eq:scale}
  \sigma(o,s) = \sigma_0 2^{o+s/S},
  \quad o \in o_{\min} + [0, \ldots, O-1],
  \quad s \in [0, \ldots, S-1]
\ee
where $\sigma_0$ is the base scale level.

The {\tt sift} function accepts the following parameters describing the Gaussian scale space being used:
\begin{itemize}
\item {\tt NumOctaves}. This is the number of octaves $O$ in \eqref{eq:scale}.
\item {\tt FirstOctave}. Index of the first octave $o_{\min}$: the octave index $o$ varies in $o_{\min},\ldots,o_{\min}+O-1$. It is usually either $0$ or $-1$. Setting $o_{\min}$ to $-1$ has the effect of doubling the size of the image before computing the Gaussian scale space.
\item {\tt NumLevels}. This is the number of sub-levels $S$ in \eqref{eq:scale}.
\item {\tt Sigma0}. Base smoothing: This is the parameter $\sigma_0$ in \eqref{eq:scale}.
\item {\tt SigmaN}. Nominal pre-smoothing: This is the nominal smoothing level of the input image. The algorithm assumes that the input image is actually $(g_{\sigma_n}*I)(x)$ as opposed to $I(x)$ and adjusts the computations accordingly.
Usually $\sigma_n$ is assumed to be half a pixel (0.5).
\end{itemize}

% ------------------------------------------------------------------------------
\subsection{Detector parameters}\label{sift.user.detector}
% ------------------------------------------------------------------------------

The SIFT frames $(x,\sigma)$ are points of local extremum of the DOG scale space. The selection of such points is controlled by the following parameters:
\begin{itemize}
\item {\tt Threshold}. Local extrema threshold. Local extrema whose value $|G(x;\sigma)|$ is below this number are rejected.
\item {\tt EdgeThreshold}. Local extrema localization threshold. If the local extremum is on a valley, the algorithm discards it because it is too unstable. Extrema are associated with a score proportional to their sharpness and are rejected if the score is below this threshold.
\item {\tt RemoveBoundaryPoints}. Boundary points removal. If this parameter is set to 1 (true), frames which are too close to the boundary of the image are rejected.
\end{itemize}

% ------------------------------------------------------------------------------
\subsection{Descriptor parameters}\label{sift.user.descriptor}
% ------------------------------------------------------------------------------

The SIFT descriptor is a weighted and interpolated histogram of the gradient orientations and locations in a patch surrounding the keypoint. The descriptor has the following parameters:
\begin{itemize}
\item {\tt Magnif}. Magnification factor $m$. Each spatial bin of the histogram has support of size $m \sigma$, where $\sigma$ is the scale of the frame.
\item {\tt NumSpatialBins}. Number of spatial bins. Together with the next parameter, this number defines the extension and dimension of the descriptor.
The dimension of the descriptor (the total number of bins) is equal to $\mathtt{NumSpatialBins}^2 \times \mathtt{NumOrientBins}$ and its extension (the patch where the gradient statistic is collected) has radius $\mathtt{NumSpatialBins} \times m\sigma/2$.
\item {\tt NumOrientBins}. Number of orientation bins.
\end{itemize}

% ------------------------------------------------------------------------------
\subsection{Direct access to SIFT components}\label{sift.user.direct}
% ------------------------------------------------------------------------------

The SIFT code is decomposed into several M and MEX files, each implementing a portion of the algorithm. These programs can be run on their own or replaced. Appendix~\ref{sift.internals} contains information useful for doing this.

\begin{example}[Computing the SIFT descriptor directly]
Sometimes it is useful to run the descriptor code alone. This can be done by calling the function {\tt siftdescriptor} (which is actually a MEX file). See the function help for further details.
\end{example}

% ------------------------------------------------------------------------------
\bibliographystyle{plain}
\bibliography{bibliography}
% ------------------------------------------------------------------------------

\appendix

% ------------------------------------------------------------------------------
\section{Internals}\label{sift.internals}
% ------------------------------------------------------------------------------

% ------------------------------------------------------------------------------
\subsection{Scale spaces}\label{sift.internals.ss}
% ------------------------------------------------------------------------------

\begin{figure}
\begin{center}
\begin{tabular}{lp{0.4\textwidth}p{0.3\textwidth}}
\hline
