% file:        sift.tex
% author:      Andrea Vedaldi
% description: SIFT code manual

% AUTORIGHTS

\documentclass{article}
\usepackage{visionlab,xspace,tabularx}
\usepackage[margin=2cm]{geometry}
\usepackage{graphics}
\usepackage[usenames]{color}
%\usepackage{showkeys}
\usepackage{hyperref}

\newcommand{\x}{\mathbf{x}}

\title{An implementation of SIFT detector and descriptor}
\author{Andrea Vedaldi\\ University of California -- VisionLab}
\date{}

\definecolor{codecolor}{rgb}{0.1,0.8,0.0}
\newcommand{\Matlab}{{\sc Matlab}\xspace}
\let\oldtt=\tt
\renewcommand{\tt}{\oldtt\color{codecolor}}

\begin{document}
\maketitle{}

% this will add all LaTeX labels as PDF destinations
% Used in combination with Hyperref, needs to be after \begin{document}
\let\oldlabel=\label
\renewcommand{\label}[1]{%
{\pdfdest name {#1} fitbh}% xyz
\oldlabel{#1}%
}

\tableofcontents{}

% ------------------------------------------------------------------------------
\section{Introduction}\label{sift.introduction}
% ------------------------------------------------------------------------------

These notes describe an implementation of the Scale-Invariant Feature
Transform (SIFT) interest point detector and descriptor
\cite{lowe04distinctive}. This implementation is designed to produce
results close to Lowe's original implementation.\footnote{See {\tt
http://www.cs.ubc.ca/\~ lowe/keypoints/}} The SIFT detector and
descriptor are discussed in some depth in the paper
\cite{lowe04distinctive}.
Here we describe the interface to our implementation and, in the
appendix, some technical details.

% ------------------------------------------------------------------------------
\section{User reference: the {\tt sift} function}\label{sift.user}
% ------------------------------------------------------------------------------

The SIFT detector and the SIFT descriptor are invoked by means of the
function {\tt sift}, which provides a unified interface to both.

\begin{example}[Invocation]
The following lines run the SIFT detector and descriptor on the image
{\tt data/test.png}.
\begin{verbatim}
  I = imread('data/test.png') ;
  % convert to double BEFORE scaling: dividing a uint8 image rounds to 0 or 1
  I = double(rgb2gray(I)) / 256 ;
  [frames,descriptors] = sift(I, 'Verbosity', 1) ;
\end{verbatim}
The option-value pair \verb$'Verbosity',1$ causes the function to print
a detailed progress report.
\end{example}

The {\tt sift} function returns a $4\times K$ matrix {\tt frames}
containing the SIFT frames and a $128 \times K$ matrix {\tt
descriptors} containing their descriptors. Each frame is characterized
by four numbers which are, in order, $(x_1,x_2)$ for the center of the
frame, $\sigma$ for its scale and $\theta$ for its orientation. The
coordinates $(x_1,x_2)$ are relative to the upper-left corner of the
image, which is assigned coordinates $(0,0)$, and may be fractional
numbers (sub-pixel precision). The scale $\sigma$ is the smoothing
level at which the frame has been detected. This number can also be
interpreted as the size of the frame, which is usually visualized as a
disk of radius $6\sigma$. Each descriptor is a vector describing
coarsely the appearance of the image patch corresponding to the frame
(further details are discussed in
Appendix~\ref{sift.internals.descriptor}). Typically this vector has
dimension 128, but this number can be changed by the user as described
later.

Once frames and descriptors of two images $I_1$ and $I_2$ have been
computed, {\tt siftmatch} can be used to estimate the pairs of
matching features.
This function uses Lowe's method to discard ambiguous
matches~\cite{lowe04distinctive}. The result is a $2\times M$ matrix,
each column of which is a pair $(k_1,k_2)$ of indices of corresponding
SIFT frames.

\begin{example}[Matching]
Let us assume that the images {\tt I1} and {\tt I2} have been loaded
and processed as in the previous example, producing {\tt frames1},
{\tt descriptors1} and {\tt frames2}, {\tt descriptors2}. The code
\begin{verbatim}
  matches = siftmatch(descriptors1, descriptors2) ;
\end{verbatim}
stores in {\tt matches} the matching pairs, one per column.
\end{example}

The package provides some ancillary functions; you can
\begin{itemize}
\item use {\tt plotsiftframe} to plot SIFT frames;
\item use {\tt plotsiftdescriptor} to plot SIFT descriptors;
\item use {\tt plotmatches} to plot feature matches;
\item use {\tt siftread} to read files produced by Lowe's implementation.
\end{itemize}

\begin{example}[Visualization]
Let {\tt I1}, {\tt I2} and {\tt matches} be as in the previous
example. To visualize the matches, issue
\begin{verbatim}
  plotmatches(I1,I2,frames1,frames2,matches)
\end{verbatim}
\end{example}

The {\tt sift} function has many parameters. The default values have
been chosen to emulate Lowe's original implementation. Although our
code does not produce frames and descriptors that are 100\% equivalent
to Lowe's, in general they are quite similar.

% ------------------------------------------------------------------------------
\subsection{Scale space parameters}\label{sift.user.ss}
% ------------------------------------------------------------------------------

The SIFT detector and descriptor are constructed from the {\em
Gaussian scale space} of the source image $I(x)$. The Gaussian scale
space is the function
\[
   G(x;\sigma) \defeq (g_\sigma*I)(x)
\]
where $g_\sigma$ is an isotropic Gaussian kernel of variance $\sigma^2
I$ (here $I$ denotes the $2\times 2$ identity matrix, not the image),
$x$ is the spatial coordinate and $\sigma$ is the scale coordinate.
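For concreteness, and assuming the standard normalization of a
two-dimensional Gaussian (which these notes do not spell out), the
kernel can be written in terms of the coordinates $x = (x_1,x_2)$ as
\[
   g_\sigma(x) = \frac{1}{2\pi\sigma^2}
   \exp\left(-\frac{x_1^2 + x_2^2}{2\sigma^2}\right).
\]
Under this assumption Gaussian kernels compose by adding variances,
$g_{\sigma_a} * g_{\sigma_b} = g_{\sigma_c}$ with $\sigma_c^2 =
\sigma_a^2 + \sigma_b^2$, so a level of the scale space can be obtained
by further smoothing any finer level rather than the original image.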
The algorithm makes use of another scale space too, called {\em
difference of Gaussian (DOG)}, which is, coarsely speaking, the scale
derivative of the Gaussian scale space.

Since the scale space $G(x;\sigma)$ represents the same information
(the image $I(x)$) at different levels of scale, it is sampled in a
particular way to reduce redundancy. The domain of the variable
$\sigma$ is discretized in logarithmic steps arranged in $O$
octaves. Each octave is further subdivided into $S$ sub-levels. The
distinction between octave and sub-level is important because at each
successive octave the data is spatially downsampled by half. Octaves
and sub-levels are identified by a discrete {\em octave index} $o$ and
{\em sub-level index} $s$ respectively. The octave index $o$ and the
sub-level index $s$ are mapped to the corresponding scale $\sigma$ by
the formula
\be\label{eq:scale}
  \sigma(o,s) = \sigma_0 2^{o+s/S},
  \quad o \in o_{\min} + [0, \ldots, O-1],
  \quad s \in [0, \ldots, S-1]
\ee
where $\sigma_0$ is the base scale level.

The {\tt sift} function accepts the following parameters describing
the Gaussian scale space being used:
\begin{itemize}
\item {\tt NumOctaves}. This is the number of octaves $O$ in
  \eqref{eq:scale}.
\item {\tt FirstOctave}. Index of the first octave $o_{\min}$: the
  octave index $o$ varies in $o_{\min},\ldots,o_{\min}+O-1$. It is
  usually either $0$ or $-1$. Setting $o_{\min}$ to $-1$ has the
  effect of doubling the image size before computing the Gaussian
  scale space.
\item {\tt NumLevels}. This is the number of sub-levels $S$ in
  \eqref{eq:scale}.
\item {\tt Sigma0}. Base smoothing: This is the parameter $\sigma_0$
  in \eqref{eq:scale}.
\item {\tt SigmaN}. Nominal pre-smoothing: This is the nominal
  smoothing level of the input image. The algorithm assumes that the
  input image is actually $(g_{\sigma_n}*I)(x)$ as opposed to $I(x)$
  and adjusts the computations accordingly.
  Usually $\sigma_n$ is assumed to be half a pixel (0.5).
\end{itemize}

% ------------------------------------------------------------------------------
\subsection{Detector parameters}\label{sift.user.detector}
% ------------------------------------------------------------------------------

The SIFT frames $(x,\sigma)$ are the local extrema of the DOG scale
space. The selection of such points is controlled by the following
parameters:
\begin{itemize}
\item {\tt Threshold}. Local extrema threshold. Local extrema whose
  value $|G(x;\sigma)|$ is below this number are rejected.
\item {\tt EdgeThreshold}. Local extrema localization threshold. If
  the local extremum is on a valley, the algorithm discards it as it
  is too unstable. Extrema are associated with a score proportional to
  their sharpness and are rejected if the score is below this
  threshold.
\item {\tt RemoveBoundaryPoints}. Boundary points removal. If this
  parameter is set to 1 (true), frames which are too close to the
  boundary of the image are rejected.
\end{itemize}

% ------------------------------------------------------------------------------
\subsection{Descriptor parameters}\label{sift.user.descriptor}
% ------------------------------------------------------------------------------

The SIFT descriptor is a weighted and interpolated histogram of the
gradient orientations and locations in a patch surrounding the
keypoint. The descriptor has the following parameters:
\begin{itemize}
\item {\tt Magnif}. Magnification factor $m$. Each spatial bin of the
  histogram has support of size $m \sigma$, where $\sigma$ is the
  scale of the frame.
\item {\tt NumSpatialBins}. Number of spatial bins. Together with the
  next parameter, this number defines the extension and dimension of
  the descriptor.
  The dimension of the descriptor (the total number of bins) is equal
  to $\mathtt{NumSpatialBins}^2 \times \mathtt{NumOrientBins}$ and its
  extension (the patch where the gradient statistic is collected) has
  radius $\mathtt{NumSpatialBins} \times m\sigma/2$.
\item {\tt NumOrientBins}. Number of orientation bins.
\end{itemize}

% ------------------------------------------------------------------------------
\subsection{Direct access to SIFT components}\label{sift.user.direct}
% ------------------------------------------------------------------------------

The SIFT code is decomposed into several M and MEX files, each
implementing a portion of the algorithm. These programs can be run on
their own or replaced. Appendix~\ref{sift.internals} contains
information useful to do this.

\begin{example}[Computing the SIFT descriptor directly]
Sometimes it is useful to run the descriptor code alone. This can be
done by calling the function {\tt siftdescriptor} (which is actually a
MEX file). See the function help for further details.
\end{example}

% ------------------------------------------------------------------------------
\bibliographystyle{plain}
\bibliography{bibliography}

% ------------------------------------------------------------------------------
\appendix
% ------------------------------------------------------------------------------
\section{Internals}\label{sift.internals}
% ------------------------------------------------------------------------------

% ------------------------------------------------------------------------------
\subsection{Scale spaces}\label{sift.internals.ss}
% ------------------------------------------------------------------------------

\begin{figure}
\begin{center}
\begin{tabular}{lp{0.4\textwidth}p{0.3\textwidth}}
\hline