% copyright (c) 1997 Jon Claerbout
\title{Empty bins and inverse interpolation}
\author{Jon Claerbout}
\maketitle
\label{paper:iin}
\par
Let us review the big picture.
In Chapter \ref{ajt/paper:ajt} we developed adjoints and
in Chapter \ref{lsq/paper:lsq} we developed inverse operators.
Logically, correct solutions come only through inversion.
Real life, however, seems nearly the opposite.
This is puzzling but intriguing.
\par
Every time you fill your car with gasoline,
it derives much more from the adjoint than from inversion.
I refer to the fact that ``practical seismic data processing''
relates much more to the use of adjoints than of inverses.
It has been widely known for about the last 15 years
that medical imaging and all basic image-creation methods are like this.
It might seem that an easy path to fame and profit would
be to introduce the notion of inversion,
but it is not that easy.
Both cost and result quality enter the picture.
\par
First consider cost.
For simplicity, consider a data space with $N$ values
and a model (or image) space of the same size.
The computational cost of applying a dense adjoint
operator increases in direct proportion to the number
of elements in the matrix, in this case $N^2$.
To achieve the minimum discrepancy between theoretical data
and observed data (inversion) theoretically requires $N$ iterations,
raising the cost to $N^3$.
\par
Consider an image of size $m\times m=N$.
Continuing, for simplicity, to assume a dense matrix of relations between
model and data,
the cost for the adjoint is $m^4$, whereas the cost for inversion is $m^6$.
We consider computational costs for the year 2000;
but since costs go as the sixth power of the mesh size,
the overall situation will not change much in the foreseeable future.
Suppose you give a stiff workout to a powerful machine;
you take an hour to invert a $4096\times 4096$ matrix.
The solution, a vector of $4096$ components, could
be laid into an image of size $64\times 64= 2^6\times 2^6 = 4096$.
Here is what we are looking at for costs:
\par
\begin{center}
\begin{tabular}{||r|r|r|r|r||} \hline
adjoint cost & $(m\times m )^2$ & $(512\times 512)^2$ & $(2^9 \cdot 2^9)^2$ & $2^{36}$ \\ \hline
inverse cost & $(m\times m )^3$ & $(64\times 64)^3$   & $(2^6 \cdot 2^6)^3$ & $2^{36}$ \\ \hline
\end{tabular}
\end{center}
\par\noindent
These numbers tell us that for applications with dense operators,
the biggest images that we are likely to see coming from inversion methods
are $64\times 64$, whereas those from adjoint methods are $512\times 512$.
For comparison, the retina of your eye is comparable to your computer
screen at $1000\times 1000$.
We might summarize by saying that while adjoint methods are less than perfect,
inverse methods are ``legally blind'' :-)
\par\noindent
\url{http://sepwww.stanford.edu/sep/jon/family/jos/gifmovie.html} holds a movie
blinking between Figures \ref{fig:512x512} and \ref{fig:64x64}.
\sideplot{512x512}{height=3.5in,width=3.5in}{
  Jos greets Andrew, ``Welcome back Andrew'' from the Peace Corps.
  At a resolution of $512\times 512$, this picture has about the same
  resolution as the paper it is printed on, or as your viewing screen,
  if you have scaled it to 50\%\ of screen size.
  }
%\newslide
\sideplot{64x64}{height=3.5in,width=3.5in}{
  Jos greets Andrew, ``Welcome back Andrew'' again.
  At a resolution of $64\times 64$ the pixels are clearly visible.
  From far the pictures are the same.
  From near, examine their glasses.
  }
%\newslide
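To check the arithmetic in the cost table above, a few lines of Python
(used here only as a back-of-the-envelope sketch) count the multiplies for
one dense adjoint application, $N^2$, and for an inversion run for $N$
iterations, $N^3$, with $N=m\times m$:
\begin{verbatim}
# Sketch: reproduce the cost table for dense operators.
def adjoint_cost(m):      # one pass of a dense N x N operator, N = m*m
    N = m * m
    return N ** 2

def inverse_cost(m):      # N iterations, each costing one dense pass
    N = m * m
    return N ** 3

assert adjoint_cost(512) == 2 ** 36   # (2^9 * 2^9)^2
assert inverse_cost(64)  == 2 ** 36   # (2^6 * 2^6)^3
\end{verbatim}
Both workloads come to $2^{36}$ operations,
which is why the same stiff workout buys a $512\times 512$ image
from the adjoint but only a $64\times 64$ image from inversion.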
\par
This cost analysis is oversimplified in that most applications
do not require dense operators.
With sparse operators, the cost advantage of adjoints is even more
pronounced, since for adjoints
the cost savings of operator sparseness translate directly to
real cost savings.
The situation is less favorable, and much muddier, for inversion.
The reason that Chapter 2 covers iterative methods
and neglects exact methods is that in practice
iterative methods are not run to their theoretical completion;
they run until we run out of patience.
\par
Cost is a big part of the story, but the story has many other parts.
Inversion, while being the only logical path to the best answer,
is a path littered with pitfalls.
The first pitfall is that the data is rarely able to determine
a complete solution reliably.
Generally there are aspects of the image that are not learnable
from the data.
\par
In this chapter we study the simplest, most transparent example
of data insufficiency.
Data exists at irregularly spaced positions in a plane.
We set up a Cartesian mesh and we discover that some
of the bins contain no data points.
What then?
%Having developed optimization methodology
%in Chapters \ref{ajt/paper:ajt} and \ref{lsq/paper:lsq},
%we now use it in more applications.
%First we cover basic concepts of \bx{missing data}
%illustrating them with tiny one-dimensional toys.
%Then we jump to some two-dimensional real-data sets
%where ``missing data'' really means ``empty bins.''
%Finally we return to 1-D synthetic data sampled irregularly.
%\sx{irregular spacing}

\section{MISSING DATA IN ONE DIMENSION}
\inputdir{miss1figs}
\par
A method for restoring \bx{missing data} is to ensure that the restored data,
after specified filtering, has minimum energy.
Specifying the filter chooses the interpolation philosophy.
Generally the filter is a \bx{roughening} filter.
\sx{filter ! roughening}
When a roughening filter goes off the end of smooth data,
it typically produces a big end transient.
Minimizing energy implies a choice for the unknown data values
at the end that minimizes the transient.
We will examine five cases and then make some generalizations.
\par
\boxit{A method for restoring missing data is to ensure that the restored
data, after specified filtering, has \protect\bx{minimum energy}.}
\par
Let $u$ denote an unknown (missing) value.
The dataset on which the examples are based is
$(\cdots, u, u,$ $1, u, 2, 1, 2, u, u, \cdots )$.
Theoretically we could adjust the missing $u$ values (each different)
to minimize the energy in the unfiltered data.
Those adjusted values would obviously turn out to be all zeros.
The unfiltered data is data that has been filtered by
an impulse function.
To find the missing values
that minimize the energy at the output of other filters,
we can use subroutine \texttt{mis1()} \vpageref{lst:mis1}.
Figure~\ref{fig:mlines}
shows interpolation of the dataset with $(1,-1)$ as a roughening filter.
The interpolated data matches the given data where they overlap.
%
\sideplot{mlines}{width=2.2in}{
  Top is given data.
  Middle is given data with interpolated values.
  Missing values seem to be interpolated by straight lines.
  Bottom shows the filter $(1,-1)$, whose output has minimum energy.
  }
\sideplot{mparab}{width=2.2in}{
  Top is the same input data as in Figure~\protect\ref{fig:mlines}.
  Middle is interpolated.
  Bottom shows the filter $(-1,2,-1)$.
  The missing data seems to be interpolated by parabolas.
  }
\sideplot{mseis}{width=2.2in}{
  Top is the same input.
  Middle is interpolated.
  Bottom shows the filter $(1,-3,3,-1)$.
  The missing data is very smooth.
  It shoots upward high off the right end of the observations,
  apparently to match the data slope there.
  }
%\ACTIVESIDEPLOT{msmo}{width=2.2in}{miss1figs}{
% The filter $(-1,-1,4,-1,-1)$ gives
% interpolations with stiff lines. They resemble
% the straight lines of Figure~\protect\FIG{mlines},
% but they project through a cluster of given values
% instead of projecting to the nearest given value.
% Thus, this interpolation tolerates noise in the given data
% better than the interpolation shown in
% Figure~\protect\FIG{mseis}.
% }
\sideplot{moscil}{width=2.2in}{
  Bottom shows the filter $(1,1)$.
  The interpolation is rough.
  Like the given data itself, the interpolation has much energy
  at the Nyquist frequency.
  But unlike the given data, it has little zero-frequency energy.
  }
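The interpolations in Figures~\ref{fig:mlines}--\ref{fig:moscil} can be
reproduced in spirit by brute force.
The sketch below (Python with \texttt{numpy}; a dense least-squares
stand-in for illustration only, not the \texttt{mis1()} subroutine
referenced above) builds the convolution matrix of a roughening filter,
keeps the filter transients off both ends, holds the known samples fixed,
and solves for the missing samples that minimize the energy of the
filtered output:
\begin{verbatim}
import numpy as np

# The 15-point example of the text: positions 4,6,7,8 (counting from 0)
# hold the known values 1,2,1,2; the other 11 samples are missing.
n = 15
known_pos = np.array([4, 6, 7, 8])
known_val = np.array([1., 2., 1., 2.])
a = np.array([1., -1.])            # roughening filter, here (1,-1)

# Convolution matrix F, end transients included: column j carries a
# copy of the filter starting at row j.
F = np.zeros((n + len(a) - 1, n))
for j in range(n):
    F[j:j + len(a), j] = a

# Hold the known samples fixed and choose the missing ones to minimize
# |F m|^2, i.e. solve  F_u m_u ~ -F_k m_k  by least squares.
known = np.zeros(n, dtype=bool)
known[known_pos] = True
rhs = -F[:, known] @ known_val
m_u = np.linalg.lstsq(F[:, ~known], rhs, rcond=None)[0]

m = np.zeros(n)
m[known] = known_val
m[~known] = m_u
print(np.round(m, 2))   # missing values fall on straight lines
\end{verbatim}
Swapping $(-1,2,-1)$, $(1,-3,3,-1)$, or $(1,1)$ into \texttt{a}
gives interpolations in the spirit of Figures~\ref{fig:mparab},
\ref{fig:mseis}, and \ref{fig:moscil}.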
\par
Figures~\ref{fig:mlines}--\ref{fig:moscil}
illustrate that the rougher the filter,
the smoother the interpolated data, and vice versa.
Let us switch our attention from the residual spectrum
to the residual itself.
The residual for Figure~\ref{fig:mlines}
is the {\it slope} of the signal
(because the filter $(1,-1)$ is a {\it first derivative}),
and the slope is constant (uniformly distributed) along the straight lines
where the least-squares procedure is choosing signal values.
So these examples confirm the idea
that the \bx{least-squares method} abhors large values
(because they are squared).
Thus, least squares tends to distribute residuals uniformly
in both time and frequency to the extent
allowed by the \bx{constraint}s.
\par
This idea helps us answer the question, what is the best filter to use?
It suggests choosing
the filter to have an amplitude spectrum
that is inverse to the spectrum we want for the interpolated data.
A systematic approach is given in Chapter \ref{mda/paper:mda},
but I offer a simple subjective analysis here:
Looking at the data, we see that all points are positive.
It seems, therefore, that
the data is rich in low frequencies;
thus the filter should contain something like $(1,-1)$,
which vanishes at zero frequency.
Likewise, the data seems to contain the Nyquist frequency,
so the filter should contain $(1,1)$.
The result of using the filter $(1,-1)\ast (1,1)=(1,0,-1)$
is shown in Figure~\ref{fig:mbest}.
This is my best subjective interpolation
based on the idea that the missing data should look like the given data.
The \bx{interpolation} and \bx{extrapolation}s are so good that
you can hardly guess which data values are given
and which are interpolated.
\sideplot{mbest}{width=2.2in}{
  Top is the same as in Figures~\protect\ref{fig:mlines}
  to \protect\ref{fig:moscil}.
  Middle is interpolated.
  Bottom shows the filter $(1,0,-1)$,
  which comes from the coefficients of $(1,-1)\ast (1,1)$.
  Both the given data and the interpolated data have significant energy
  at both zero and Nyquist frequencies.
  }
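The frequency-domain reasoning behind this choice can be written out
explicitly (in Z-transform notation, $Z=e^{i\omega\Delta t}$, introduced
here only for this check).
The filter $(1,-1)$ has transform $1-Z$, which vanishes at zero frequency
($Z=1$); the filter $(1,1)$ has transform $1+Z$, which vanishes at the
Nyquist frequency ($Z=-1$).
Their product,
\[
(1-Z)\,(1+Z) \;=\; 1-Z^2 ,
\]
is the transform of $(1,0,-1)$ and vanishes at both $Z=1$ and $Z=-1$.
Because this filter annihilates exactly the zero-frequency and Nyquist
components that dominate the given data,
the minimum-energy criterion is free to put those components into the
missing values, and the interpolation looks like the data.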
\subsection{Missing-data program}
\sx{missing data}
Now let us see how Figures \ref{fig:mlines}--\ref{fig:mbest}
could have been calculated, and how they actually were calculated.
They could have been calculated with matrices,
pulling the matrices apart according to the subscripts of known
or missing data;
instead I computed them with operators,
applying only operators and their adjoints.
First we inspect the matrix approach
because it is more conventional.

\subsubsection{Matrix approach to missing data}
\sx{missing data}
\par
Customarily, we have referred to data by the symbol $\bold d$.
Now that we are dividing the data space into two parts,
known and unknown (or missing),
we will refer to this complete space
as the model (or map) space $\bold m$.
\par
There are 15 data points in Figures \ref{fig:mlines}--\ref{fig:mbest}.
Of the 15, 4 are known and 11 are missing.
Denote the known by $k$ and the missing by $u$.
Then the sequence of missing and known
is $(u,u,u,u,k,u,k,k,k,u,u,u,u,u,u)$.
Because I cannot print $15\times 15$ matrices,
please allow me to describe instead a data space of 6 values
$(m_1, m_2, m_3, m_4, m_5, m_6)$ with known values only $m_2$ and $m_3$,
that is, arranged like $(u,k,k,u,u,u)$.
\par
Our approach is to minimize the energy in the residual,
which is the filtered map (model) space.
We state the fitting goals $\bold 0\approx \bold F\bold m$ as
\begin{equation}
\left[
 \begin{array}{c}
  0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0
 \end{array}
\right]
\quad \approx \quad \bold r \quad = \quad
\left[
 \begin{array}{cccccc}
  a_1 &  0  &  0  &  0  &  0  &  0  \\
  a_2 & a_1 &  0  &  0  &  0  &  0  \\
  a_3 & a_2 & a_1 &  0  &  0  &  0  \\
   0  & a_3 & a_2 & a_1 &  0  &  0  \\
   0  &  0  & a_3 & a_2 & a_1 &  0  \\
   0  &  0  &  0  & a_3 & a_2 & a_1 \\
   0  &  0  &  0  &  0  & a_3 & a_2 \\
   0  &  0  &  0  &  0  &  0  & a_3
 \end{array}
\right] \;
\left[
 \begin{array}{c}
  m_1 \\ m_2 \\ m_3 \\ m_4 \\ m_5 \\ m_6
 \end{array}
\right]
\label{eqn:headache}
\end{equation}
We rearrange the above fitting goals,
bringing the columns multiplying known data values
($m_2$ and $m_3$) to the left,
getting $\bold y =-\bold F_k \bold m_k \approx \bold F_u \bold m_u$.
\begin{equation}
\left[
 \begin{array}{c}
  y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8
 \end{array}
\right]
\quad = \quad -
\left[
 \begin{array}{cc}
   0  &  0  \\
  a_1 &  0  \\
  a_2 & a_1 \\
  a_3 & a_2 \\
   0  & a_3 \\
   0  &  0  \\
   0  &  0  \\
   0  &  0
 \end{array}
\right] \;
\left[
 \begin{array}{c}
  m_2 \\ m_3
 \end{array}
\right]
\quad \approx \quad
\left[
 \begin{array}{cccc}
  a_1 &  0  &  0  &  0  \\
  a_2 &  0  &  0  &  0  \\
  a_3 &  0  &  0  &  0  \\
   0  & a_1 &  0  &  0  \\
   0  & a_2 & a_1 &  0  \\