\begin{enumerate}
\item Choose an initial model ${\bf m}_0$ and compute the residual
  ${\bf r}_0 = {\bf d - A\,m}_0$.
\item At the $n$-th iteration, choose the initial search direction
  ${\bf c}_n$.
\item If $n$ is greater than 1, optimize the search direction by
  adding a linear combination of the previous directions, according to
  equations (\ref{eqn:step}) and (\ref{eqn:betaj}), and compute the
  modified step direction ${\bf s}_n$.
\item Find the step length $\alpha_n$ according to equation
  (\ref{eqn:alpha}). The orthogonality principles (\ref{eqn:asj}) and
  (\ref{eqn:rasn}) can simplify this equation to the form
  \begin{equation}
    \alpha_n = {{\left({\bf r}_{n-1},\,{\bf A\,c}_n\right)} \over
      {\|{\bf A\,s}_n\|^2}}\;.
    \label{eqn:alphan}
  \end{equation}
\item Update the model ${\bf m}_n$ and the residual ${\bf r}_n$
  according to equations (\ref{eqn:mn}) and (\ref{eqn:rn}).
\item Repeat iterations until the residual decreases to the required
  accuracy or for as long as it is practical.
\end{enumerate}
At each of the subsequent steps, the residual is guaranteed not to
increase, according to equation (\ref{eqn:pythagor}). Furthermore,
optimizing the search direction guarantees that the convergence rate
does not decrease in comparison with (\ref{eqn:deltar}). The only
assumption we need to arrive at this conclusion is that the operator
${\bf A}$ is linear. However, without additional assumptions, we
cannot guarantee global convergence of the algorithm to the
least-squares solution of equation (\ref{eqn:equation}) in a finite
number of steps.
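
For concreteness, the scheme above can be spelled out in code. The
following C fragment is only a schematic sketch with hypothetical
names (\verb!cd_iteration!, \verb!dot!); it is not the \verb!cdstep()!
program discussed later in this paper. It assumes that the initial
direction ${\bf c}_n$ and its image ${\bf A\,c}_n$ have already been
computed by the forward operator, and that the previous directions
${\bf s}_j$, their conjugate counterparts ${\bf A\,s}_j$, and the
norms $\|{\bf A\,s}_j\|^2$ are kept in storage.
\begin{lstlisting}[frame=single,language=C]
/* A schematic conjugate-direction iteration (hypothetical names).
   On input, c holds the initial direction c_n and Ac holds A c_n;
   on output, they hold the modified direction s_n and A s_n. */
static double dot(int n, const double *a, const double *b)
/* dot product (a,b) */
{
    double sum = 0.;
    int i;
    for (i=0; i < n; i++) sum += a[i]*b[i];
    return sum;
}

void cd_iteration(int nm      /* model size */,
                  int nd      /* data size */,
                  int iter    /* iteration count, starting from 0 */,
                  double *m   /* model, updated in place */,
                  double *r   /* residual r = d - A m, updated */,
                  double *c   /* c_n on input, s_n on output */,
                  double *Ac  /* A c_n on input, A s_n on output */,
                  double **s  /* stored directions s_j */,
                  double **As /* stored conjugate directions A s_j */,
                  double *Asn /* stored norms ||A s_j||^2 */)
{
    double alpha, beta;
    int i, j;

    /* steps 2-3: add a combination of the previous directions so
       that A s_n becomes orthogonal to every stored A s_j */
    for (j=0; j < iter; j++) {
        beta = - dot(nd, Ac, As[j]) / Asn[j];
        for (i=0; i < nm; i++) c[i]  += beta * s[j][i];
        for (i=0; i < nd; i++) Ac[i] += beta * As[j][i];
    }
    /* remember s_n, A s_n, and ||A s_n||^2 for later iterations */
    for (i=0; i < nm; i++) s[iter][i]  = c[i];
    for (i=0; i < nd; i++) As[iter][i] = Ac[i];
    Asn[iter] = dot(nd, Ac, Ac);

    /* step 4: the step length; by the orthogonality principles,
       (r_{n-1}, A c_n) = (r_{n-1}, A s_n) */
    alpha = dot(nd, r, Ac) / Asn[iter];

    /* step 5: update the model and the residual */
    for (i=0; i < nm; i++) m[i] += alpha * c[i];
    for (i=0; i < nd; i++) r[i] -= alpha * Ac[i];
}
\end{lstlisting}
The cost of such an iteration grows with the number of stored
directions, which is the price paid for the guaranteed behavior of the
residual.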
\section{WHAT ARE ADJOINTS FOR? THE METHOD OF CONJUGATE GRADIENTS}
The adjoint operator ${\bf A}^T$ projects the data space back to the
model space and is defined by the dot-product test
\begin{equation}
  \left({\bf d},\,{\bf A\,m}\right) \equiv
  \left({\bf A}^T\,{\bf d},\,{\bf m}\right)
  \label{eqn:dottest}
\end{equation}
for any ${\bf m}$ and ${\bf d}$. The method of conjugate gradients is
a particular case of the method of conjugate directions, where the
initial search direction ${\bf c}_n$ is
\begin{equation}
  {\bf c}_n = {\bf A}^T\,{\bf r}_{n-1}\;.
  \label{eqn:gradient}
\end{equation}
This direction is often called the {\em gradient,} because it
corresponds to the local gradient of the squared residual norm with
respect to the current model ${\bf m}_{n-1}$. Aligning the initial
search direction along the gradient leads to the following remarkable
simplifications in the method of conjugate directions.

\subsection{Orthogonality of the gradients}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The orthogonality principle (\ref{eqn:rasnj}) transforms according to
the dot-product test (\ref{eqn:dottest}) to the form
\begin{equation}
  \left({\bf r}_{n-1},\,{\bf A\,s}_{j}\right) =
  \left({\bf A}^T\,{\bf r}_{n-1},\,{\bf s}_{j}\right) =
  \left({\bf c}_{n},\,{\bf s}_{j}\right) = 0\;,\;\;1 \leq j \leq n-1\;.
  \label{eqn:csj}
\end{equation}
Forming the dot product $\left({\bf c}_{n},\,{\bf c}_{j}\right)$ and
applying formula (\ref{eqn:step}), we can see that
\begin{equation}
  \left({\bf c}_{n},\,{\bf c}_{j}\right) =
  \left({\bf c}_{n},\,{\bf s}_{j} -
    \sum_{i=1}^{i=j-1}\,\beta_j^{(i)}\,{\bf s}_{i}\right) =
  \left({\bf c}_{n},\,{\bf s}_{j}\right) -
  \sum_{i=1}^{i=j-1}\,\beta_j^{(i)}\,\left({\bf c}_{n},\,{\bf s}_{i}\right) =
  0\;,\;\;1 \leq j \leq n-1\;.
  \label{eqn:cncj}
\end{equation}
Equation (\ref{eqn:cncj}) proves the orthogonality of the gradient
directions from different iterations. Since the gradients are
orthogonal, after $n$ iterations they form a basis in the
$n$-dimensional space. In other words, if the model space has $n$
dimensions, each vector in this space can be represented by a linear
combination of the gradient vectors formed by $n$ iterations of the
conjugate-gradient method. This is true as well for the vector
${\bf m}_0 - {\bf m}$, which points from the solution of equation
(\ref{eqn:equation}) to the initial model estimate ${\bf m}_0$.
Neglecting computational errors, it takes exactly $n$ iterations to
find this vector by successive optimization of the coefficients. This
proves that the conjugate-gradient method converges to the exact
solution in a finite number of steps (assuming that the model belongs
to a finite-dimensional space).
\par
The method of conjugate gradients simplifies formula
(\ref{eqn:alphan}) to the form
\begin{equation}
  \alpha_n = {{\left({\bf r}_{n-1},\,{\bf A\,c}_n\right)} \over
    {\|{\bf A\,s}_n\|^2}} =
  {{\left({\bf A}^T\,{\bf r}_{n-1},\,{\bf c}_n\right)} \over
    {\|{\bf A\,s}_n\|^2}} =
  {{\|{\bf c}_n\|^2} \over {\|{\bf A\,s}_n\|^2}}\;,
  \label{eqn:alphag}
\end{equation}
which in turn leads to the simplification of formula
(\ref{eqn:pythagor}), as follows:
\begin{equation}
  \|{\bf r}_n\|^2 = \|{\bf r}_{n-1}\|^2 -
  {{\|{\bf c}_n\|^4} \over {\|{\bf A\,s}_n\|^2}}\;.
  \label{eqn:pythagog}
\end{equation}
If the gradient is not equal to zero, the residual is guaranteed to
decrease. If the gradient is equal to zero, we have already found the
solution.

\subsection{Short memory of the gradients}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Substituting the gradient direction (\ref{eqn:gradient}) into formula
(\ref{eqn:betaj}) and applying formulas (\ref{eqn:rn}) and
(\ref{eqn:dottest}), we can see that
\begin{equation}
  \beta_n^{(j)} =
  {{\left({\bf A\,c}_n,\,{\bf r}_{j} - {\bf r}_{j-1}\right)} \over
    {\alpha_j\|{\bf A\,s}_{j}\|^2}} =
  {{\left({\bf c}_n,\,{\bf A}^T\,{\bf r}_{j} -
        {\bf A}^T\,{\bf r}_{j-1}\right)} \over
    {\alpha_j\|{\bf A\,s}_{j}\|^2}} =
  {{\left({\bf c}_n,\,{\bf c}_{j+1} - {\bf c}_{j}\right)} \over
    {\alpha_j\|{\bf A\,s}_{j}\|^2}}\;.
  \label{eqn:betag}
\end{equation}
The orthogonality condition (\ref{eqn:cncj}) and the definition of the
coefficient $\alpha_j$ from equation (\ref{eqn:alphag}) further
transform this formula to the form
\begin{eqnarray}
  \beta_n^{(n-1)} & = & {{\|{\bf c}_n\|^2} \over
    {\alpha_{n-1}\|{\bf A\,s}_{n-1}\|^2}} =
  {{\|{\bf c}_n\|^2} \over {\|{\bf c}_{n-1}\|^2}}\;,
  \label{eqn:betag1} \\
  \beta_n^{(j)} & = & 0\;,\;\;1 \leq j \leq n-2\;.
  \label{eqn:betag2}
\end{eqnarray}
Equation (\ref{eqn:betag2}) shows that the conjugate-gradient method
needs to remember only the previous step direction in order to
optimize the search at each iteration. This is another remarkable
property distinguishing that method in the family of
conjugate-direction methods.
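
To emphasize this short memory, a conjugate-gradient iteration can be
sketched in C as follows. As before, this is only an illustrative
fragment with hypothetical names (it reuses the \verb!dot! helper from
the previous sketch and is not Claerbout's \verb!cgstep()! routine
discussed below): only the previous step direction, its conjugate
counterpart, and the previous gradient norm are carried between
iterations, in accordance with equations (\ref{eqn:betag1}) and
(\ref{eqn:betag2}).
\begin{lstlisting}[frame=single,language=C]
/* A schematic conjugate-gradient iteration.  On input, c holds the
   gradient c_n = A^T r_{n-1} and Ac holds A c_n. */
void cg_iteration(int nm, int nd,
                  int iter        /* iteration count, from 0 */,
                  double *m       /* model, updated in place */,
                  double *r       /* residual r = d - A m, updated */,
                  const double *c /* gradient c_n */,
                  const double *Ac/* A c_n */,
                  double *s       /* s_{n-1} in, s_n out */,
                  double *As      /* A s_{n-1} in, A s_n out */,
                  double *cnorm   /* ||c_{n-1}||^2 in, ||c_n||^2 out */)
{
    double cn, beta, alpha;
    int i;

    cn = dot(nm, c, c);            /* ||c_n||^2 */

    if (0 == iter) {               /* first step: steepest descent */
        for (i=0; i < nm; i++) s[i]  = c[i];
        for (i=0; i < nd; i++) As[i] = Ac[i];
    } else {                       /* beta from equation betag1 */
        beta = cn / (*cnorm);
        for (i=0; i < nm; i++) s[i]  = c[i]  + beta * s[i];
        for (i=0; i < nd; i++) As[i] = Ac[i] + beta * As[i];
    }
    *cnorm = cn;

    alpha = cn / dot(nd, As, As);  /* equation alphag */

    for (i=0; i < nm; i++) m[i] += alpha * s[i];   /* update model    */
    for (i=0; i < nd; i++) r[i] -= alpha * As[i];  /* update residual */
}
\end{lstlisting}
In contrast with the general conjugate-direction sketch, no arrays of
previous directions are needed, which keeps both the storage and the
cost per iteration constant.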
\section{PROGRAM}
The program in Table~\ref{tbl:program} implements one iteration of
the conjugate-direction method. It is based upon Jon Claerbout's
\verb!cgstep()! program \cite[]{gee} and uses an analogous naming
convention. Vectors in the data space are denoted by double letters.
\tabl{program}{The source of this program is \texttt{RSF/filt/lib/cdstep.c}}{\lstinputlisting[frame=single,firstline=46,lastline=86]{\RSF/filt/lib/cdstep.c}}
%\listing{cdstep.rt}
%In addition to the previous steps ${\bf s}_j$ (array
%\verb!s!) and their conjugate counterparts ${\bf A\,s}_j$ (array
%\verb!ss!), the program stores the squared norms $\|{\bf A\,s}_j\|^2$
%(array \verb!ssn!) to avoid recomputation.
%For practical reasons, the number of remembered iterations
%\verb!niter! can actually be smaller than the total number of
%iterations. The value \verb!niter=2! corresponds to the
%conjugate-gradient method. The value \verb!niter=1! corresponds to the
%steepest-descent method. The iteration process should start with
%\verb!iter = 1!, corresponding to the first steepest-descent iteration
%in the method of conjugate gradients.
In addition to the previous steps ${\bf s}_j$ and their conjugate
counterparts ${\bf A\,s}_j$ (array \verb!s!), the program stores the
squared norms $\|{\bf A\,s}_j\|^2$ (variable \verb!beta!) to avoid
recomputation. For practical reasons, the number of remembered
iterations can actually be smaller than the total number of
iterations.
%The value \verb!niter=2! corresponds to the
%conjugate-gradient method. The value \verb!niter=1! corresponds to the
%steepest-descent method. The iteration process should start with
%\verb!iter = 1!, corresponding to the first steepest-descent iteration
%in the method of conjugate gradients.

\section{EXAMPLES}
%%%%%%%%%%%%%%%%
\subsection{Example 1: Inverse interpolation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\inputdir{inter}
Matthias Schwab has suggested (in a personal communication) an
interesting example, in which the \verb!cgstep! program fails to
comply with the conjugate-gradient theory. The inverse problem is a
simple one-dimensional data interpolation with a known filter
\cite[]{gee}. The known portion of the data is a single spike in the
middle. One hundred other data points are considered missing. The
known filter is the Laplacian $(1,-2,1)$, and the expected result is
a bell-shaped cubic spline. The forward problem is strictly linear,
and the exact adjoint is easily computed by reverse convolution.
However, the conjugate-gradient program requires significantly more
than the theoretically predicted 100 iterations. Figure
\ref{fig:dirmcg} displays the convergence to the final solution in
three different plots. According to the figure, the actual number of
iterations required for convergence is about 300. Figure
\ref{fig:dirmcd} shows the result of a similar experiment with the
conjugate-direction solver \verb!cdstep!. The number of required
iterations is reduced to almost the theoretical one hundred. This
indicates that the orthogonality of directions implied in the
conjugate-gradient method has been distorted by computational errors.
The additional cost of correcting these errors with the
conjugate-direction solver comes from storing the preceding 100
directions in memory. A smaller number of memorized steps produces
smaller improvements.
\plot{dirmcg}{width=6in,height=2in}{Convergence of the missing data
  interpolation problem with the conjugate-gradient solver. Current
  models are plotted against the number of iterations. The three
  plots are different displays of the same data.}
\plot{dirmcd}{width=6in,height=2in}{Convergence of the missing data
  interpolation problem with the long-memory conjugate-direction
  solver. Current models are plotted against the number of
  iterations. The three plots are different displays of the same
  data.}
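
To give a concrete flavor of this test, the following self-contained
C sketch (with arbitrary sizes and hypothetical names; it is not one
of the programs used for the figures above) implements the Laplacian
filter $(1,-2,1)$ as a linear operator together with its exact
adjoint, computed by reverse convolution, and verifies the pair with
the dot-product test (\ref{eqn:dottest}).
\begin{lstlisting}[frame=single,language=C]
#include <stdio.h>
#include <stdlib.h>

/* forward operator: convolution of the model with (1,-2,1);
   the output is two samples shorter than the input */
void laplacian_lop(int nm, const double *m, double *d)
{
    int i;
    for (i=0; i < nm-2; i++)
        d[i] = m[i] - 2.*m[i+1] + m[i+2];
}

/* exact adjoint: reverse convolution (correlation), spreading
   each data sample back onto three model samples */
void laplacian_adj(int nm, double *m, const double *d)
{
    int i;
    for (i=0; i < nm; i++) m[i] = 0.;
    for (i=0; i < nm-2; i++) {
        m[i]   += d[i];
        m[i+1] -= 2.*d[i];
        m[i+2] += d[i];
    }
}

int main(void)
{
    enum { NM = 101, ND = NM-2 };   /* sizes chosen arbitrarily */
    double m[NM], d[ND], am[ND], atd[NM];
    double dot1 = 0., dot2 = 0.;
    int i;

    /* random test vectors for the dot-product test */
    for (i=0; i < NM; i++) m[i] = rand()/(double) RAND_MAX;
    for (i=0; i < ND; i++) d[i] = rand()/(double) RAND_MAX;

    laplacian_lop(NM, m, am);    /* A m   */
    laplacian_adj(NM, atd, d);   /* A^T d */

    for (i=0; i < ND; i++) dot1 += d[i]*am[i];    /* (d, A m)   */
    for (i=0; i < NM; i++) dot2 += atd[i]*m[i];   /* (A^T d, m) */

    /* the two numbers should agree to machine precision */
    printf("(d, A m) = %g   (A^T d, m) = %g\n", dot1, dot2);
    return 0;
}
\end{lstlisting}
Since the adjoint is exact up to round-off, the slow convergence in
Figure \ref{fig:dirmcg} is attributable to accumulated computational
errors rather than to an inexact adjoint, as discussed above.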
\subsection{Example 2: Velocity transform}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\inputdir{veltran}
The next test example is the velocity transform inversion of a CMP
gather from the Mobil AVO dataset
\cite[]{Nichols.sep.82.1,Lumley.sep.82.25,Lumley.sep.82.63}. I use Jon
Claerbout's \verb!veltran! program \cite[]{Claerbout.bei.95} for
anti-aliased velocity transform with rho-filter preconditioning and
compare three different pairs of operators for inversion. The first
pair is the CMP stacking operator with the ``migration'' weighting
function $\left(w = {{\left(t_0/t\right)} \over \sqrt{t}}\right)$ and
its adjoint. The second pair is the ``pseudo-unitary'' velocity
transform with the weighting proportional to $\sqrt{|s\,x|}$, where
$x$ is the offset and $s$ is the slowness. These two pairs were used
in the velocity transform inversion with the iterative
conjugate-gradient solver. The third pair uses the weight
proportional to $|x|$ for CMP stacking and $|s|$ for the reverse
operator. Since these two operators are not exact adjoints, it is
appropriate to apply the method of conjugate directions for
inversion. The convergence of the three different inversions is
compared in Figure \ref{fig:diritr}. We can see that the third method
reduces the least-squares residual error further, although its effect
is smaller than that of the pseudo-unitary weighting in comparison
with the uniform one. The results of inversion after 10
conjugate-gradient iterations are plotted in Figures \ref{fig:dircvv}
and \ref{fig:dirrst}, which are to be compared with the analogous
results of \cite{Lumley.sep.82.63} and \cite{Nichols.sep.82.1}.
\plot{diritr}{width=6in,height=3in}{Comparison of convergence of the
  iterative velocity transform inversion. The left plot compares
  conjugate-gradient inversion with unweighted (uniformly weighted)
  and pseudo-unitary operators. The right plot compares
  pseudo-unitary conjugate-gradient and weighted conjugate-direction
  inversion.}
\plot{dircvv}{width=6in,height=3in}{Input CMP gather (left) and its
  velocity transform counterpart (right) after 10 iterations of
  conjugate-direction inversion.}
\plot{dirrst}{width=6in,height=3in}{The modeled CMP gather (left) and
  the residual data (right) plotted at the same scale.}

\begin{comment}
\subsection{Example 3: Leveled inverse interpolation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The third example is the linearized nonlinear inversion for
interpolating the SeaBeam dataset \cite[]{gee,Crawley.sep.84.279}.
This interpolation problem is nonlinear because the prediction-error
filter is estimated simultaneously with the missing data. The
conjugate-gradient solver showed very slow convergence in this case.
Figure \ref{fig:dirjbm} compares the results of the
conjugate-gradient and conjugate-direction methods after 2500
iterations. Because of the large scale of the problem, I set
\verb!niter=4! in the \verb!cdstep()! program, storing only the three
preceding steps of the conjugate-direction optimization. The
acceleration of convergence produced a noticeably better
interpolation, which is visible in the figure.
%\plot{dirjbm}{width=6in,height=3in}{SeaBeam interpolation.
%Left plot: the result of the conjugate-gradient inversion
%after 2500 iterations. Right plot: the result of the short-memory
%conjugate-direction inversion after 2500 iterations.}
\end{comment}

\section{Conclusions}
%%%%
The conjugate-gradient solver is a powerful method of least-squares
inversion because of its remarkable algebraic properties. In
practice, the theoretical basis of conjugate gradients can be
distorted by computational errors. In some applications of inversion,
we may even want to distort it on purpose, by applying inexact
adjoints in preconditioning. In both cases, a safer alternative is
the method of conjugate directions. Jon Claerbout's \verb!cgstep()!
program actually implements a short-memory version of the
conjugate-direction method. Extending the length of the memory raises
the cost of iterations but can speed up the convergence.

\bibliographystyle{seg}
\bibliography{paper,SEG,SEP2}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: 
