Fourier-transformed variables are often capitalized.
This convention will be helpful here, so in this subsection only,
we capitalize vectors transformed by the $\bold F$ matrix.
As everywhere, a matrix such as $\bold F$
is printed in {\bf boldface} type,
but in this subsection, vectors are {\it not} printed in boldface.
Thus we define the solution, the solution step
(from one iteration to the next), and the gradient by
\begin{eqnarray}
X &=& \bold F \ x \quad\quad\quad {\rm solution} \\
S_j &=& \bold F \ s_j \quad\quad\quad {\rm solution\ step} \\
G_j &=& \bold F \ g_j \quad\quad\quad {\rm solution\ gradient}
\end{eqnarray}
A linear combination in solution space, say $s+g$,
corresponds to $S+G$ in the conjugate space,
because $S+G = \bold F s + \bold F g = \bold F(s+g)$.
According to equation (\ref{eqn:cg1b}),
the residual is the theoretical data minus the observed data.
\begin{equation}
R \eq \bold F x \ -\ D \eq X \ -\ D
\end{equation}
The solution $x$ is obtained by a succession of steps $s_j$, say
\begin{equation}
x \eq s_1 \ +\ s_2 \ +\ s_3 \ +\ \cdots
\end{equation}
The last stage of each iteration is to update the solution and the residual:
\begin{eqnarray}
\label{eqn:xupdate}
{\rm solution\ update:} \quad\quad\quad & x \ \leftarrow x& +\ s \\
\label{eqn:Rupdate}
{\rm residual\ update:} \quad\quad\quad & R \ \leftarrow R& +\ S
\end{eqnarray}
\par
The {\it gradient} vector $g$ has the same number
of components as the solution vector $x$.
A vector with this number of components is
\begin{eqnarray}
g &=& \bold F' \ R \eq \hbox{gradient} \label{eqn:g} \\
G &=& \bold F \ g \eq \hbox{conjugate gradient} \label{eqn:G}
\end{eqnarray}
The gradient $g$ in the transformed space is $G$,
also known as the \bx{conjugate gradient}.
\par
The minimization (\ref{eqn:mindot}) is now generalized
to scan not only the line with $\alpha$,
but simultaneously another line with $\beta$.
The combination of the two lines is a plane:
\begin{equation}
Q(\alpha ,\beta ) \eq
( R + \alpha G + \beta S) \ \cdot\ (R + \alpha G + \beta S )
\label{eqn:cgqf}
\end{equation}
The minimum is found where $\partial Q / \partial \alpha \,=\,0$ and
$\partial Q / \partial \beta \,=\,0$, namely,
\begin{equation}
0 \eq G \ \cdot\ ( R + \alpha G + \beta S )
\end{equation}
\begin{equation}
0 \eq S \ \cdot\ ( R + \alpha G + \beta S )
\end{equation}
The solution is
\begin{equation}
\label{eqn:twobytwosln}
\left[
\begin{array}{c}
\alpha \\
\beta
\end{array}
\right]
\ = \
{-1 \over (G\cdot G)(S\cdot S)-(G\cdot S)^2}
\left[
\begin{array}{rr}
(S \cdot S) & -(S \cdot G) \\
-(G \cdot S) & (G \cdot G)
\end{array}
\right] \;
\left[
\begin{array}{c}
(G\cdot R) \\
(S\cdot R)
\end{array}
\right]
\end{equation}
This may look complicated.
The steepest-descent method requires us to compute
only the two dot products
$\bold r \cdot \Delta \bold r$ and
$\Delta \bold r \cdot \Delta \bold r$,
while equation (\ref{eqn:cgqf}) leads to five dot products,
but the extra trouble is well worthwhile because the ``conjugate direction''
is a much better direction than the gradient direction.
\par
The many applications in this book all need to
find $\alpha$ and $\beta$ with (\ref{eqn:twobytwosln}), then
update the solution with (\ref{eqn:xupdate}) and
update the residual with (\ref{eqn:Rupdate}).
Thus we package these activities in a subroutine
named \texttt{cgstep}.
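To make the step concrete, here is a minimal C sketch of the
arithmetic at the heart of such a routine:
form the five dot products and solve the $2\times 2$ system
(\ref{eqn:twobytwosln}) for $\alpha$ and $\beta$.
The names \texttt{dot()} and \texttt{plane\_search()} are
illustrative only; the actual library code is in \texttt{cgstep()} below.
\begin{verbatim}
#include <stddef.h>

/* Sketch only: solve equation (twobytwosln) for alpha and beta,
   given conjugate gradient G, previous step S, and residual R. */
static double dot(const double *a, const double *b, size_t n) {
    double sum = 0.;
    for (size_t i = 0; i < n; i++) sum += a[i] * b[i];
    return sum;
}

void plane_search(const double *G, const double *S, const double *R,
                  size_t n, double *alpha, double *beta) {
    double gg = dot(G, G, n), ss = dot(S, S, n), gs = dot(G, S, n);
    double gr = dot(G, R, n), sr = dot(S, R, n);
    double det = gg*ss - gs*gs;     /* assumed nonzero; see below */
    *alpha = -(ss*gr - gs*sr) / det;
    *beta  = -(gg*sr - gs*gr) / det;
}
\end{verbatim}
On the first iteration there is no previous step, so $S=0$ and the
determinant vanishes; a real implementation must treat that case as a
pure steepest-descent step, which is exactly the special case handled
by the \texttt{forget} flag of \texttt{cgstep()} described below.
\par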
To use that subroutine we will have a computation \bx{template}
like the one we had for steepest descent, except that
the repetitive work is done by subroutine {\tt cgstep}.
This template (or pseudocode) for minimizing the residual
$\bold 0 \approx \bold r = \bold F \bold x - \bold d$
by the conjugate-direction method is
\label{lsq/'cgtemplate'}
\begin{tabbing}
mmmmmm \= mmmmmm \= mmmmm \kill
\> $\bold r \padarrow \bold F \bold x - \bold d$ \\
\> {\rm iterate \{ } \\
\> \> $\Delta \bold x \padarrow \bold F'\ \bold r$ \\
\> \> $\Delta \bold r\ \padarrow \bold F \ \Delta \bold x$ \\
\> \> $(\bold x,\bold r) \padarrow {\rm cgstep} (\bold x, \Delta\bold x, \bold r, \Delta\bold r )$ \\
\> \> \}
\end{tabbing}
where the subroutine {\tt cgstep()}
remembers the previous iteration and
works out the step size and adds in
the proper proportion of the $\Delta \bold x$ of
the previous step.
%\par
%Most of the least-squares solver subroutines in this book
%follow the above \bx{template}.
%They may look less complicated when they start from $\bold x=\bold 0$
%or more complicated when $\bold F$ has several parts.
%$\bold F$ is often a partitioned matrix of operators
%and the code for applying it
%will have subtle differences from the code
%for applying its adjoint $\bold F'$.
%Often we fit two expressions simultaneously,
%$\bold 0\approx \bold L\bold x-\bold d$ and
%$\bold 0\approx \bold A\bold x$, and then
%$\bold F$
%is the column matrix
%\begin{equation}
%\bold F \eq
% \left[
% \begin{array}{c}
% \bold L \\
% \bold A
% \end{array}
% \right]
%\end{equation}
%\begin{notforlecture}
\subsection{Routine for one step of conjugate-direction descent}
\par
\begin{comment}
The \bx{conjugate-direction program}
can be divided into two parts:
an inner part that is used almost without change
over a wide variety of applications,
and an outer part containing memory allocations,
operator invocations, and initializations.
Because \bx{Fortran} does not recognize the difference between upper- and
lower-case letters,
\end{comment}
The conjugate vectors $G$ and $S$ in the program are denoted by
{\tt gg} and {\tt ss}.
The inner part of the conjugate-direction task is in
function {\tt cgstep()}.
%%\moddex[f90]{cgstep}{one step of CD}
\moddex{cgstep}{one step of CD}{38}{84}{filt/lib}
\par
Observe that the \texttt{cgstep()} function has a logical parameter
called \texttt{forget}.
This parameter does not need to be input.
In the normal course of things, \texttt{forget} will be true
on the first iteration and false on subsequent iterations.
This refers to the fact that on the first iteration,
there is no previous step,
so the conjugate-direction method
reduces to the steepest-descent method.
At any iteration, however, you have the option to set
\texttt{forget=.true.},
which amounts to restarting the calculation
from the current location,
something we rarely find reason to do.
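\par
For readers who prefer code to pseudocode, here is a minimal C sketch
of the template above, assuming an operator with the
(adjoint, add, model, data) calling convention of
Chapter~\ref{ajt/paper:ajt}.
The driver name \texttt{cd\_template()} and the exact
\texttt{cgstep()} argument list shown here are assumptions for
illustration, not the library interfaces.
\begin{verbatim}
#include <stdlib.h>

typedef void (*oper)(int adj, int add, int nx, int nd,
                     double *x, double *d);

/* Hypothetical cgstep signature, for illustration only. */
void cgstep(int forget, int nx, int nd,
            double *x, const double *dx, double *r, const double *dr);

/* Illustrative driver: r = F x - d, then iterate the CD template. */
void cd_template(oper Fop, int nx, int nd,
                 double *x, const double *d, int niter) {
    double *r  = malloc(nd * sizeof *r);    /* residual       */
    double *dx = malloc(nx * sizeof *dx);   /* model step     */
    double *dr = malloc(nd * sizeof *dr);   /* residual step  */
    int i, iter;

    Fop(0, 0, nx, nd, x, r);                /* r = F x        */
    for (i = 0; i < nd; i++) r[i] -= d[i];  /* r = F x - d    */

    for (iter = 0; iter < niter; iter++) {
        Fop(1, 0, nx, nd, dx, r);           /* dx = F' r      */
        Fop(0, 0, nx, nd, dx, dr);          /* dr = F dx      */
        /* cgstep chooses alpha, beta and updates x and r;
           forget is true only on the first iteration.        */
        cgstep(iter == 0, nx, nd, x, dx, r, dr);
    }
    free(r); free(dx); free(dr);
}
\end{verbatim}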
\subsection{A basic solver program}
There are many different methods for iterative least-squares estimation,
some of which will be discussed later in this book.
The conjugate-gradient (CG) family
(including the first-order conjugate-direction method described above)
share the property that, in theory, they achieve the solution
in $n$ iterations, where $n$ is the number of unknowns.
The various CG methods differ
in their numerical errors,
the memory they require,
their adaptability to nonlinear optimization,
and their requirements on the accuracy of the adjoint.
What we do in this section is show you the generic interface.
\par
None of us is an expert in both geophysics and optimization theory (OT),
yet we need to handle both.
We would like to have each group write its own code with
a relatively easy interface.
The problem is that the OT codes must invoke the physical operators,
yet the OT codes should not
need to deal with all the data and parameters needed by the physical operators.
\par
In other words,
if a practitioner decides to
swap one solver for another,
the only thing needed is the name of the new solver.
\par
The operator entrance is for the geophysicist,
who formulates the estimation problem.
The solver entrance is for the specialist in numerical algebra,
who designs a new optimization method.
The C programming language allows us
to achieve this design goal by means of generic function interfaces.
\par
A basic solver is \texttt{tinysolver}.
\moddex{tinysolver}{tiny solver}{23}{53}{filt/lib}
\par
The two most important arguments to \texttt{tinysolver()}
are the operator function \texttt{Fop},
which is defined by the interface from Chapter \ref{ajt/paper:ajt},
and the stepper function \texttt{stepper},
which implements one step of an iterative estimation.
For example, a practitioner who chooses to use our new
\texttt{cgstep()} \vpageref{lst:cgstep}
for iteratively solving the operator
\texttt{matmult} \vpageref{lst:matmult}
would write the call
\par
\texttt{tinysolver ( matmult\_lop, cgstep, ...}
\par
\noindent
so while you are reading the \texttt{tinysolver} module,
you should visualize the \texttt{Fop()} function
as being \texttt{matmult\_lop}, and
you should visualize the \texttt{stepper()} function
as being \texttt{cgstep}.
\par
The other required parameters to \texttt{tinysolver()} are
\texttt{d} (the data we want to fit),
\texttt{m} (the model we want to estimate),
and \texttt{niter} (the maximum number of iterations).
There are also a couple of optional arguments.
For example, \texttt{m0} is the starting guess for the model.
If this parameter is omitted, the model is initialized to zero.
To output the final residual vector,
we include a parameter called \texttt{resd},
which is optional as well.
We will watch how the list of optional parameters
to the generic solver routine grows
as we attack more and more complex problems in later chapters.
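\par
A complete call might look like the following sketch, which sets up
the same $5 \times 4$ system solved in the test case below.
The headers, the \texttt{matmult\_init()} helper, and the argument
order are assumptions for illustration; consult the
\texttt{tinysolver} listing above for the exact signature.
\begin{verbatim}
/* Hypothetical usage sketch: names, headers, and argument order
   are assumptions; see the tinysolver listing for the real API. */
#include "matmult.h"     /* assumed: matmult_init, matmult_lop */
#include "cgstep.h"      /* assumed: cgstep                    */
#include "tinysolver.h"  /* assumed: tinysolver                */

int main(void) {
    double F[5][4] = { {1.,1.,1.,0.}, {1.,2.,0.,0.}, {1.,3.,1.,0.},
                       {1.,4.,0.,1.}, {1.,5.,1.,1.} };
    double d[5] = {3., 3., 5., 7., 9.};   /* observed data       */
    double m[4] = {0., 0., 0., 0.};       /* model, starts at 0  */

    matmult_init(&F[0][0]);   /* attach the matrix to the operator */
    tinysolver(matmult_lop, cgstep, 5, 4, d, m, 4 /* niter */);
    return 0;
}
\end{verbatim}
This is the pattern the test program in the next subsection follows:
initialize the operator, then hand the operator and the stepper to the
generic solver.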
\subsection{Why C is much better than Fortran 77}
I'd like to digress from our geophysics-mathematics themes
to explain why C has been a great step forward
over Fortran 77.
All the illustrations in this book were originally computed in F77.
Back then, module
\texttt{tinysolver} \vpageref{lst:tinysolver}
was simply a subroutine.
It was not one module for the whole book, as it is now,
but many conceptually identical subroutines,
dozens of them, one subroutine for each application.
The reason for the proliferation was that F77 lacks the ability of C
to represent operators as having two ways to enter,
one for science and another for math.
On the other hand, F77 did not require the half page
of definitions that we see here in C.
But the definitions are not difficult to understand,
and they are clutter that we must see once and never again.
Another benefit is that the book in F77 had no easy way to switch
from the \texttt{cgstep} solver to other solvers.
\subsection{Test case: solving some simultaneous equations}
\par
Now we assemble a module \texttt{cgtest} for solving simultaneous equations.
Starting with the conjugate-direction module {\tt cgstep} \vpageref{lst:cgstep},
we insert the module \texttt{matmult} \vpageref{lst:matmult} as the linear operator.
\moddex{cgtest}{demonstrate CD}{23}{30}{user/fomels}
\par
Below is the solution to a $5 \times 4$ set of simultaneous equations.
Observe that the ``exact'' solution is obtained in the last step.
Because the data and answers are integers,
it is quick to check the result manually.
\par\noindent
\footnotesize
\begin{verbatim}
d transpose
       3.00       3.00       5.00       7.00       9.00
F transpose
       1.00       1.00       1.00       1.00       1.00
       1.00       2.00       3.00       4.00       5.00
       1.00       0.00       1.00       0.00       1.00
       0.00       0.00       0.00       1.00       1.00
for iter = 0, 4
x    0.43457383   1.56124675   0.27362058   0.25752524
res -0.73055887   0.55706739   0.39193487  -0.06291389  -0.22804642
x    0.51313990   1.38677299   0.87905121   0.56870615
res -0.22103602   0.28668585   0.55251014  -0.37106210  -0.10523783
x    0.39144871   1.24044561   1.08974111   1.46199656
res -0.27836466  -0.12766013   0.20252672  -0.18477242   0.14541438
x    1.00001287   1.00004792   1.00000811   2.00000739
res  0.00006878   0.00010860   0.00016473   0.00021179   0.00026788
x    1.00000024   0.99999994   0.99999994   2.00000024
res -0.00000001  -0.00000001   0.00000001   0.00000002  -0.00000001
\end{verbatim}
\normalsize
\begin{exer}
\item
One way to remove a mean value $m$ from a signal $s(t) = \bold s$
is with the fitting goal $\bold 0 \approx \bold s - m$.
What operator matrix is involved?
\item
What linear operator subroutine from Chapter \ref{paper:ajt}
can be used for finding the mean?
\item
How many CD iterations should be required to get the exact mean value?
\item
Write a mathematical expression for finding the mean by the CG method.
\end{exer}
%\end{notforlecture}
\section{INVERSE NMO STACK}
\inputdir{invstack}
\sx{NMO stack}
To illustrate solving a huge set of simultaneous
equations without ever writing down the matrix of coefficients,
we consider how
{\it \bx{back projection}} can be upgraded towards
{\it \bx{inversion}} in the application called \bx{moveout and stack}.
\sideplot{invstack}{width=3in}{
Top is a model trace $\bold m$.
Next are the synthetic data traces, $\bold d = \bold M \bold m$.
Then, labeled {\tt niter=0}, is the \protect\bx{stack},
a result of processing by adjoint modeling.
Increasing values of {\tt niter} show $\bold m$ as a function of
iteration count in the fitting goal $\bold d \approx \bold M \bold m$.
(Carlos Cunha-Filho)
}
\par
The seismograms at the bottom of Figure~\ref{fig:invstack}
show the first four iterations of conjugate-direction inversion.
You see the original rectangle-shaped waveform returning
as the iterations proceed.
Notice also on the \bx{stack}
that the early and late events have unequal amplitudes,
but after enough iterations they are equal,
as they began.
Mathematically, we can denote the top trace