% Started 11/29/00
%\shortnote

\lefthead{}
\righthead{}

\title{Multidimensional recursive filter preconditioning \\
  in geophysical estimation problems}

%\renewcommand{\author}[1]{%
%\begin{centering}
% \textbf{#1}
%\end{centering}
%}

\author{Sergey Fomel\/\footnotemark[1] and Jon F. Claerbout\/\footnotemark[2]} % \\
%(\today)

\footnotetext[1]{\emph{Bureau of Economic Geology, Jackson School of Geosciences,
The University of Texas at Austin, University Station, Box X,
Austin, TX 78713-8972, USA}}

\footnotetext[2]{\emph{Stanford Exploration Project, Mitchell Bldg.,
Department of Geophysics, Stanford University,
Stanford, California 94305-2215, USA}}

\maketitle

\begin{abstract}
  Constraining ill-posed inverse problems often requires regularized
  optimization. We consider two alternative approaches to
  regularization. The first approach involves a column operator and an
  extension of the data space. It requires a regularization operator
  which enhances the undesirable features of the model. The second
  approach constructs a row operator and expands the model space. It
  employs a preconditioning operator, which enforces a desirable
  behavior, such as smoothness, of the model. In large-scale problems,
  when iterative optimization is incomplete, the second method is
  preferable, because it often leads to faster convergence. We propose
  a method for constructing preconditioning operators by
  multidimensional recursive filtering. The recursive filters are
  constructed by imposing helical boundary conditions. Several examples
  with synthetic and real data demonstrate an order of magnitude
  efficiency gain achieved by applying the proposed technique to data
  interpolation problems.
\end{abstract}

\section{Introduction}

Regularization is a method of imposing additional conditions for
solving inverse problems with optimization methods. When model
parameters are not fully constrained by the problem (i.e., the inverse
problem is mathematically ill-posed), regularization restricts the
variability of the model and guides iterative optimization to the
desired solution by using additional assumptions about the model power,
smoothness, predictability, etc. In other words, it constrains the
model null space to an \emph{a priori} chosen pattern. A thorough
mathematical theory of regularization has been introduced by works of
Tikhonov's school \cite[]{tikh}.
\par
In this paper, we discuss two alternative formulations of regularized
least-squares inversion problems. The first formulation, which we call
\emph{model-space}, extends the data space and constructs a composite
column operator. The second, \emph{data-space}, formulation extends
the model space and constructs a composite row operator. This second
formulation is intrinsically related to the concept of model
preconditioning \cite[]{GEO59-05-08180829}. We illustrate the general
regularization theory with simple synthetic examples.
\par
Though the final results of the model-space and data-space
regularization are theoretically identical, the behavior of iterative
gradient-based methods, such as the method of conjugate gradients, is
different for the two cases. The obvious difference arises in the case
where the number of model parameters is significantly larger than the
number of data measurements. In this case, the matrix inverted in the
data-space regularization is smaller than the model-space matrix, and
the conjugate-gradient iteration converges in a correspondingly smaller
number of iterations.
But even in the case where the numbers of model and data parameters are
comparable, preconditioning changes the iteration behavior. This
follows from the fact that the objective function gradients with
respect to the model parameters are different in the two cases.
%The first iteration of
%the model-space regularization yields $\mathbf{L}^T \mathbf{d}$ as the
%model estimate regardless of the regularization operator~$\mathbf{D}$,
%while the first iteration of the data-space regularization yields
%$\mathbf{C L}^T \mathbf{d}$, which is an already ``simplified'' version of
%the model.
Since iteration to the exact solution is rarely achieved in large-scale
geophysical applications, the results of iterative optimization may
turn out quite different. \cite{harlan} points out that the goals of
the model-space regularization conflict with each other: the first one
emphasizes ``details'' in the model, while the second one tries to
smooth them out. He describes the advantage of preconditioning as
follows:
\begin{quote}
  The two objective functions produce different results when
  optimization is incomplete. A descent optimization of the original
  (\emph{model-space}) objective function will begin with complex
  perturbations of the model and slowly converge toward an increasingly
  simple model at the global minimum. A descent optimization of the
  revised (\emph{data-space}) objective function will begin with simple
  perturbations of the model and slowly converge toward an increasingly
  complex model at the global minimum. \ldots A more economical
  implementation can use fewer iterations. Insufficient iterations
  result in an insufficiently complex model, not in an insufficiently
  simplified model.
\end{quote}
In this paper, we illustrate the two approaches on synthetic and real
data examples from simple environmental data sets. All examples show
that when we solve the optimization problem iteratively and take the
output after only a limited number of iterations, it is preferable to
use the preconditioning approach. When the regularization operator is
convolution with a filter, a natural choice for preconditioning is
inverse filtering (recursive deconvolution). We show how to extend the
method of preconditioning by recursive filtering to multiple
dimensions. The extension is based on modifying the boundary conditions
with the helix transform \cite[]{helix}.
%Invertible multidimensional filters can be
%created by helical spectral factorization.

\section{Review of regularization in estimation problems}

The basic linear system of equations for least-squares optimization
can be written in the form
\begin{equation}
  \mathbf{d - L m \approx 0}\;,
  \label{eqn:main}
\end{equation}
where $\mathbf{m}$ is the vector of model parameters, $\mathbf{d}$ is
the vector of experimental data values, $\mathbf{L}$ is the modeling
operator, and the approximate equality implies finding the solution by
minimizing the power of the left-hand side. The goal of linear
estimation is to estimate an optimal~$\mathbf{m}$ for
a given~$\mathbf{d}$.
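To keep the discussion concrete, the following minimal sketch (not part
of the original examples of this paper) sets up a small
data-interpolation instance of system~(\ref{eqn:main}) in NumPy. The
grid size, measurement locations, and data values are hypothetical, and
$\mathbf{L}$ is assumed to be a simple selection (binning) operator:
\begin{verbatim}
import numpy as np

# A small, hypothetical instance of the fitting goal  d - L m ~ 0:
# interpolate a 1-D signal known at a few scattered locations.
n = 50                                     # model parameters (fine grid)
known = np.array([3, 12, 27, 35, 44])      # indices of the measurements
d = np.array([0.2, 1.0, -0.5, 0.7, 0.1])   # measured values

# L is a selection (binning) operator: it picks the known samples,
# so the problem is underdetermined (5 equations, 50 unknowns).
L = np.zeros((known.size, n))
L[np.arange(known.size), known] = 1.0

# Any model that passes through the five points fits the data exactly;
# regularization is needed to select one of them.
\end{verbatim}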
\subsection{Model-space regularization}

Model-space regularization implies adding equations to
system~(\ref{eqn:main}) to obtain a fully constrained (well-posed)
inverse problem. The additional equations take the form
\begin{equation}
  \epsilon \mathbf{D m \approx 0} \;,
  \label{eqn:mreg}
\end{equation}
where $\mathbf{D}$ is a linear operator that represents additional
requirements for the model, and $\epsilon$ is the scaling parameter.
In many applications, $\mathbf{D}$ can be thought of as a filter,
enhancing undesirable components in the model, or as the operator of a
differential equation that we assume the model should satisfy.

The full system of equations (\ref{eqn:main}--\ref{eqn:mreg}) can be
written in a short notation as
\begin{equation}
  \mathbf{G_m m} =
  \left[\begin{array}{c} \mathbf{L} \\ \epsilon \mathbf{D} \end{array}\right]
  \mathbf{m} \approx
  \left[\begin{array}{c} \mathbf{d} \\ \mathbf{0} \end{array}\right] =
  \hat{\mathbf{d}}\;,
  \label{eqn:main-m}
\end{equation}
where $\hat{\mathbf{d}}$ is the augmented data vector:
\begin{equation}
  \hat{\mathbf{d}} =
  \left[\begin{array}{c} \mathbf{d} \\ \mathbf{0} \end{array}\right]\;,
  \label{eqn:dhat}
\end{equation}
and $\mathbf{G_m}$ is a \emph{column} operator:
\begin{equation}
  \mathbf{G_m} =
  \left[\begin{array}{c} \mathbf{L} \\ \epsilon \mathbf{D} \end{array}\right]\;.
  \label{eqn:gm}
\end{equation}
The estimation problem (\ref{eqn:main-m}) is fully constrained. We can
solve it by means of unconstrained least-squares optimization,
minimizing the least-squares norm of the compound residual vector
\begin{equation}
  \hat{\mathbf{r}} = \hat{\mathbf{d}} - \mathbf{G_m m} =
  \left[\begin{array}{c} \mathbf{d - L m} \\
      - \epsilon \mathbf{D m} \end{array}\right]\;.
  \label{eqn:rhat}
\end{equation}
The formal solution of the regularized optimization problem has the
known form \cite[]{parker}
\begin{equation}
  <\!\!\mathbf{m}\!\!> =
  \left(\mathbf{L}^T\,\mathbf{L} +
    \epsilon^2\,\mathbf{D}^T\,\mathbf{D}\right)^{-1}\,
  \mathbf{L}^T\,\mathbf{d}\;,
  \label{eqn:minv1}
\end{equation}
where $<\!\!\mathbf{m}\!\!>$ denotes the least-squares estimate of
$\mathbf{m}$, and $\mathbf{L}^T$ denotes the adjoint operator. One can
carry out the optimization iteratively with the help of the
conjugate-gradient method \cite[]{Hestenes.1952} or its analogs
\cite[]{Paige.acm.8.195}. In the next subsection, we describe an
alternative formulation of the optimization problem.
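For illustration only, the sketch below evaluates
estimate~(\ref{eqn:minv1}) directly on the hypothetical toy problem
introduced above. The choice of $\mathbf{D}$ as a causal
first-difference (roughening) filter and the value of $\epsilon$ are
assumptions made for this sketch, not the operators used in the
examples of this paper:
\begin{verbatim}
import numpy as np

# Hypothetical toy problem: L picks 5 known samples of a 50-point model.
n = 50
known = np.array([3, 12, 27, 35, 44])
d = np.array([0.2, 1.0, -0.5, 0.7, 0.1])
L = np.zeros((known.size, n))
L[np.arange(known.size), known] = 1.0

# Model-space estimate  <m> = (L^T L + eps^2 D^T D)^{-1} L^T d,
# with D an illustrative causal first-difference (roughening) filter.
eps = 0.1
D = np.eye(n) - np.eye(n, k=-1)
m_ms = np.linalg.solve(L.T @ L + eps**2 * D.T @ D, L.T @ d)
\end{verbatim}
In practice, for large sparse operators $\mathbf{L}$ and $\mathbf{D}$,
the inverse in~(\ref{eqn:minv1}) is never formed explicitly; the system
is solved iteratively, as noted above.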
\subsection{Data-space regularization (model preconditioning)}

The data-space regularization approach is closely connected with the
concept of \emph{model preconditioning}. We can introduce a new model
$\mathbf{p}$ with the equality
\begin{equation}
  \mathbf{m = P p}\;,
  \label{eqn:prec}
\end{equation}
where $\mathbf{P}$ is a preconditioning operator. The residual vector
$\mathbf{r}$ for the data-fitting goal~(\ref{eqn:main}) can be defined
by the relationship
\begin{equation}
  \epsilon \mathbf{r = d - L m = d - L P p}\;,
  \label{eqn:r}
\end{equation}
where $\epsilon$ is the same scaling parameter as in~(\ref{eqn:mreg})
-- the reason for this choice will be clear from the analysis that
follows. Let us consider a compound model $\hat{\mathbf{p}}$, composed
of the preconditioned model vector $\mathbf{p}$ and the residual
$\mathbf{r}$:
\begin{equation}
  \label{eqn:hatp}
  \hat{\mathbf{p}} =
  \left[\begin{array}{c} \mathbf{p} \\ \mathbf{r} \end{array}\right]\;.
\end{equation}
With respect to the compound model, we can rewrite equation
(\ref{eqn:r}) as
\begin{equation}
  \left[\begin{array}{cc} \mathbf{L P} & \epsilon \mathbf{I} \end{array}\right]
  \left[\begin{array}{c} \mathbf{p} \\ \mathbf{r} \end{array}\right] =
  \mathbf{G_d} \hat{\mathbf{p}} = \mathbf{d}\;,
  \label{eqn:main-d}
\end{equation}
where $\mathbf{G_d}$ is a \emph{row} operator:
\begin{equation}
  \mathbf{G_d} =
  \left[\begin{array}{cc} \mathbf{L P} & \epsilon \mathbf{I} \end{array}\right]\;,
  \label{eqn:gd}
\end{equation}
and $\mathbf{I}$ represents the data-space identity operator.

Equation (\ref{eqn:main-d}) is clearly underdetermined with respect to
the compound model $\hat{\mathbf{p}}$. If from all possible solutions
of this system we seek the one with the minimal power
$\hat{\mathbf{p}}^T \hat{\mathbf{p}}$, the formal result takes the
well-known form
\begin{equation}
  <\!\!\hat{\mathbf{p}}\!\!> =
  \left[\begin{array}{c} <\!\!\mathbf{p}\!\!> \\ <\!\!\mathbf{r}\!\!>
    \end{array} \right] =
  \mathbf{G_d}^T \left(\mathbf{G_d}\,\mathbf{G_d}^T\right)^{-1} \mathbf{d} =
  \left[\begin{array}{c}
      \mathbf{P}^T \mathbf{L}^T \left(\mathbf{L P P}^T \mathbf{L}^T +
        \epsilon^2 \mathbf{I}\right)^{-1} \mathbf{d} \\
      \epsilon \left(\mathbf{L P P}^T \mathbf{L}^T +
        \epsilon^2 \mathbf{I}\right)^{-1} \mathbf{d}
    \end{array} \right]\;.
  \label{eqn:dsol-i}
\end{equation}
Applying equation~(\ref{eqn:prec}), we obtain the corresponding
estimate $<\!\!\mathbf{m}\!\!>$ for the initial model $\mathbf{m}$, as
follows:
\begin{equation}
  \label{eqn:dinv1}
  <\!\!\mathbf{m}\!\!> =
  \mathbf{P}\,\mathbf{P}^T\,\mathbf{L}^T\,\left(
    \mathbf{L}\,\mathbf{P}\,\mathbf{P}^T\,\mathbf{L}^T +
    \epsilon^2\,\mathbf{I}\right)^{-1}\,\mathbf{d}\;.
\end{equation}
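The sketch below evaluates estimate~(\ref{eqn:dinv1}) for the same
hypothetical toy problem, with $\mathbf{P}$ assumed to be causal
integration, the exact inverse of the first-difference roughener used
in the previous sketch (so that $\mathbf{D P = I}$). Under this
assumption the exact solutions of the two formulations coincide, which
the final check confirms numerically:
\begin{verbatim}
import numpy as np

# Same hypothetical toy problem as in the previous sketches.
n = 50
known = np.array([3, 12, 27, 35, 44])
d = np.array([0.2, 1.0, -0.5, 0.7, 0.1])
L = np.zeros((known.size, n))
L[np.arange(known.size), known] = 1.0
eps = 0.1

# Preconditioner P chosen as causal integration, the exact inverse of
# the causal first-difference roughener D used above (D P = I).
D = np.eye(n) - np.eye(n, k=-1)
P = np.tril(np.ones((n, n)))

# Data-space (preconditioned) estimate:
#   <m> = P P^T L^T (L P P^T L^T + eps^2 I)^{-1} d
LP = L @ P
m_ds = P @ P.T @ L.T @ np.linalg.solve(
    LP @ LP.T + eps**2 * np.eye(known.size), d)

# The exact solutions of the two formulations coincide:
m_ms = np.linalg.solve(L.T @ L + eps**2 * D.T @ D, L.T @ d)
assert np.allclose(m_ds, m_ms)
\end{verbatim}
The equivalence holds only for the exact solutions; as discussed above,
truncated iterations behave differently in the two formulations.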