\subsection{Internal boundaries to multidimensional convolution}
\sx{convolution ! two-dimensional}
\par
Sometimes we deal with small patches of data.
In order that boundary phenomena not dominate
the calculation intended in the central region,
we need to take care that
input data is not assumed to be zero
beyond the interval on which the data is given.
\par
The two little triangular patches of zeros in
the convolution matrix in equation (\ref{eqn:exmiss})
describe end conditions where it is assumed that the data $y_t$
vanishes before $t=1$ and after $t=6$.
Alternately, we might not wish to make that assumption,
in which case the triangles filled with zeros could be regarded as missing data.
In this one-dimensional example,
it is easy to see that the output flag, say \verb#yy->mis#,
should be set to \texttt{true} at the ends
so that no output is ever computed there.
We would like a general multidimensional algorithm
to correctly specify \verb#yy->mis#
around the multidimensional boundaries.
This proceeds like the missing-data algorithm:
we apply a filter of all ones to a data-space template that is
all zeros except for ones at the locations of missing data,
in this case $y_0,y_{-1}$ and $y_7,y_8$.
This amounts to surrounding the original data set with some missing data.
We need padding of the filter's size on all sides.
The padded region is filled with ones (designating missing inputs).
Wherever the convolution output is nonzero,
\verb#yy->mis# is set to \texttt{true},
denoting an output with missing inputs.
\inputdir{XFig}
\par
The two-dimensional case is a little more cluttered than the 1-D case,
but the principle is about the same.
Figure \ref{fig:rabdomain} shows a larger input domain,
a $5\times 3$ filter, and a smaller output domain.
\sideplot{rabdomain}{width=1.5in}{
  Domain of inputs and outputs of a two-dimensional filter like a PEF.
}
There are two things to notice.
First, sliding the filter everywhere inside the outer box,
we get outputs (under the 1 location) only in the inner box.
Second (the adjoint idea), crosscorrelating the inner and outer boxes gives
us the $3\times 5$ patch of information
we use to build the filter coefficients.
We need to be careful not to assume
that signals vanish outside the region where they are defined.
In a later chapter we will break data spaces into overlapping patches,
separately analyze the patches, and put everything back together.
We do this because crosscorrelations change with time,
yet they are handled as constant in short time windows.
There we must be particularly careful that zero signal values not be presumed
outside of the small volumes;
otherwise the many edges and faces of the many small volumes
can overwhelm the interior that we want to study.
\par
In practice, the input and output are allocated equal memory,
but the output residual is initialized to zero everywhere
and then not computed
except where shown in Figure \ref{fig:rabdomain}.
\par
Below is module \texttt{bound},
which builds a selector for filter outputs that should never be examined or even computed
(because they need input data from outside the given data space).
Inputs are a filter \texttt{aa}
and the size of its cube \texttt{na = (na(1),na(2),...)}.
Also input are two cube dimensions:
that of the data last used by the filter, \texttt{nold}, and
that of the filter's next intended use, \texttt{nd}
(\texttt{nold} and \texttt{nd} are often the same).
Module \texttt{bound}
begins by defining a bigger data space with room for a filter
surrounding the original data space \texttt{nd} on all sides.
It does this with the line \texttt{nb=nd+2*na}.
Then we allocate two data spaces,
\texttt{xx} and \texttt{yy}, of the bigger size \texttt{nb},
and pack ones
in a frame of width \texttt{na} around the outside of \texttt{xx}.
The filter \texttt{aa} is also filled with ones.
The filter \texttt{aa} must be regridded for the bigger \texttt{nb}
data space (regridding merely changes the lag values of the ones).
Now we filter the input \texttt{xx} with \texttt{aa}, getting \texttt{yy}.
Wherever the output is nonzero,
we have an output that has been affected by the boundary;
such an output should not be computed.
Thus we allocate the logical mask \verb#aa->mis#
(a part of the helix filter definition
in module \texttt{helix} \vpageref{lst:helix}),
and wherever we see a nonzero value of \texttt{yy}
in the output,
we designate the output as depending on missing inputs by setting
\verb#aa->mis# to \texttt{true}.
\moddex{bound}{out of bounds dependency}{30}{74}{user/gee}
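\par
For readers who want to experiment outside the accompanying Fortran library,
here is a minimal sketch of the same recipe in Python with NumPy.
The function name \verb#out_of_bounds_mask# and the causal lag convention
(the filter's ``1'' at lag zero) are inventions of this sketch,
not the book's code.
\begin{verbatim}
import numpy as np

def out_of_bounds_mask(nd, na):
    # Illustrative sketch, not the gee library code.
    # True where a filter output would need inputs from
    # outside the given data cube (any number of dimensions).
    nd, na = tuple(nd), tuple(na)
    nb = tuple(d + 2 * a for d, a in zip(nd, na))  # nb = nd + 2*na
    xx = np.ones(nb)                  # ones designate missing inputs
    inner = tuple(slice(a, a + d) for a, d in zip(na, nd))
    xx[inner] = 0.0                   # zeros over the given data
    yy = np.zeros(nd)
    # Apply a filter of all ones: accumulate xx over every lag.
    for lag in np.ndindex(*na):
        shifted = tuple(slice(a - l, a - l + d)
                        for a, l, d in zip(na, lag, nd))
        yy += xx[shifted]
    return yy > 0.0   # nonzero output => depends on missing inputs
\end{verbatim}
On the helix, the loop over the filter cube becomes a loop over the
filter's one-dimensional lag table,
which is why the same code serves any number of dimensions.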
\par
In reality one would set up the boundary conditions
with module \texttt{bound}
before identifying locations of missing data
with module \texttt{misinput}.
Both modules are based on the same concept,
but the boundaries are more cluttered and confusing,
which is why we examined them later.

\subsection{Finding the prediction-error filter}
\sx{filter ! prediction-error}
\par
The first stage of the least-squares estimation
is computing the \bx{prediction-error filter}.
The second stage will be using it to find the missing data.
The input data space contains a mixture of known data values
and missing unknown ones.
For the first stage of finding the filter,
we generally have many more fitting equations than we need,
so we can proceed by ignoring the fitting equations
that involve missing data values.
We ignore them everywhere that the missing inputs hit the filter.
\par
The codes here do not address the difficulty that perhaps
too much data is missing, so that all weights are zero.
To add stabilization we could supplement the data volume
with a ``training dataset''
or with a ``prior filter''.
With things as they are,
if there is not enough data to specify a prediction-error filter,
you should encounter the error exit from \texttt{cgstep()} \vpageref{lst:cgstep}.
\moddex{pef}{estimate PEF on a helix}{28}{38}{user/gee}
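\par
The first stage is easy to see stripped to its essentials in one dimension.
The sketch below (the name \verb#estimate_pef# and the convention that the
leading PEF coefficient is pinned to one are assumptions of the sketch;
module \texttt{pef} does this on a helix in any number of dimensions)
keeps only the unbroken regression equations.
\begin{verbatim}
import numpy as np

def estimate_pef(d, known, na):
    # Illustrative sketch, not the gee library code.
    # Fit a PEF (1, a1, ..., a_{na-1}) by least squares, keeping
    # only regression equations whose filter inputs are all known.
    rows, rhs = [], []
    for t in range(na - 1, len(d)):
        if known[t - na + 1 : t + 1].all():    # unbroken equation
            # 0 ~ d[t] + a1*d[t-1] + ... + a_{na-1}*d[t-na+1]
            rows.append(d[t - na + 1 : t][::-1])
            rhs.append(-d[t])
    A, b = np.array(rows), np.array(rhs)
    a_tail, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.concatenate(([1.0], a_tail))
\end{verbatim}
Equations that touch the ends or a hole simply never enter the fit,
which is the role that \texttt{bound} and \texttt{misinput}
play for the helix code.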
%Results are shown in Figures
%\ref{fig:wood-hole}-\ref{fig:brick-hole}.
%Here again, the PEF is carefully designed using
%module \texttt{bound} \vpageref{lst:bound}
%so that no filter outputs are used where the filter
%is only partway on the data.
%After the PEF was designed, it is applied with the
%more cavalier helix treatment of boundaries,
%so although you cannot see the frame of excluded outputs,
%you can see what happens as the filter is climbing onto the data.
%\activeplot{wood-hole}{width=6.0in,height=1.75in}{} {
%	Hole in wood.
%	}
%\activeplot{herr-hole}{width=6.0in,height=1.75in}{} {
%	Hole in herringbone.
%	}
%\activeplot{brick-hole}{width=6.0in,height=1.75in}{} {
%	Hole in brick.
%	}
\section{TWO-STAGE LINEAR LEAST SQUARES}
\sx{Two-stage linear least squares}
\sx{two-stage ! least squares}
In Chapter \ref{iin/paper:iin} and
Chapter \ref{prc/paper:prc}
we filled empty bins
by minimizing the energy output from the filtered mesh.
In each case there was arbitrariness in the choice of the filter.
Here we find and use the optimum filter, the PEF.
\par
%We will develop code in which \bx{missing data}
%in a training dataset
%is restored by a two-stage linear least squares process.
%Since we are using a helix,
%the training volume could be a line, a plane, a volume, or a hypervolume.
%In the first stage,
%we fit a prediction-error filter (PEF) to the given plane.
%Fitting equations that involve empty bins
%are weighted to zero
%so they have no effect on the answer, the PEF.
The first stage is that of the previous section:
finding the optimal PEF while
carefully avoiding any regression equations
that involve boundaries or missing data.
For the second stage, we take the PEF as known and
find values for the empty bins so that
the power out of the prediction-error filter is minimized.
To do this we find missing data with
module \texttt{mis2()} \vpageref{lst:mis2}.
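\par
In the same one-dimensional toy setting,
the second stage looks like the dense-matrix sketch below.
The name \verb#fill_missing# is hypothetical and stands in for the
iterative solver inside \texttt{mis2()};
values of \texttt{d} at missing slots are placeholders and are never read.
\begin{verbatim}
import numpy as np

def fill_missing(d, known, a):
    # Illustrative sketch, not the gee library code.
    # With the PEF a held fixed, choose the missing values of d
    # that minimize the power out of the prediction-error filter.
    n, na = len(d), len(a)
    # Convolution-matrix rows r_t = sum_k a[k] d[t-k]; interior
    # rows only, so no zero values are presumed beyond the data.
    T = np.array([[a[t - j] if 0 <= t - j < na else 0.0
                   for j in range(n)] for t in range(na - 1, n)])
    free = ~known
    # 0 ~ T d  splits into  0 ~ T[:,free] d_free + T[:,known] d_known.
    x, *_ = np.linalg.lstsq(T[:, free], -T[:, known] @ d[known],
                            rcond=None)
    out = d.copy()
    out[free] = x
    return out
\end{verbatim}
The known data enters only through the right-hand side,
which is how the constraint of chapter \ref{iin/paper:iin} appears here.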
\par
This two-stage method avoids the nonlinear problem
we would otherwise face if we included the fitting
equations containing both free data values and free filter values.
Presumably, after
two stages of linear least squares
we are close enough to the final solution
that we could switch over to the full nonlinear setup
described near the end of this chapter.
\inputdir{hole}
\par
The synthetic data in Figure \ref{fig:hole}
is a superposition of two plane waves of different directions,
each with a random (but low-passed) waveform.
After punching a hole in the data,
we find that the lost data is pleasingly restored,
though a bit weak near the side boundary.
This imperfection could result
from the side-boundary behavior of the operator
or from an insufficient number of missing-data iterations.
\plot{hole}{width=6in,height=2.5in}{
  Original data (left), with a zeroed hole, restored,
  residual selector (right).
}
\par
The residual selector in Figure \ref{fig:hole}
shows where the filter output has valid inputs.
From it you can deduce the size and shape of the filter,
namely that it matches up with Figure \ref{fig:rabdomain}.
The ellipsoidal hole in the residual selector is larger
than that in the data because we lose regression equations
not only at the hole,
but wherever any part of the filter overlaps the hole.
\par
The results in Figure \ref{fig:hole} are essentially perfect,
reflecting the fact that the synthetic example
fits the conceptual model perfectly.
Before we look at the many examples
in Figures \ref{fig:herr-hole-fillr}-\ref{fig:WGstack-hole-fillr},
we will examine another gap-filling strategy.

\subsection{Adding noise (Geostat)}
\inputdir{morgan}
\sx{geostatistics}
In chapter \ref{iin/paper:iin} we restored missing data
by adopting the philosophy of minimizing the energy in filtered output.
In this chapter we learned about an optimum filter
for this task, the prediction-error filter (PEF).
Let us name this method the ``minimum noise'' method
of finding missing data.
\par
A practical problem with the minimum-noise method is evident
in a large empty hole such as in Figures
\ref{fig:herr-hole-fillr}-\ref{fig:brick-hole-fillr}.
In such a void the interpolated data diminishes greatly.
Thus we have not totally succeeded in the goal of
``hiding our data acquisition footprint'',
which we would like to do if we are trying to make
pictures of the earth and not pictures of our
data acquisition footprint.
\par
What we will do next is useful in some applications but not in others.
Misunderstood or misused, it is rightly controversial.
We are going to fill the empty holes
with something that looks like the original data but really is not.
I will distinguish the words ``\bx{synthetic data}''
(data derived from a physical model)
from ``\bx{simulated data}'' (data manufactured from a statistical model).
We will fill the empty holes with simulated data
like what you see in the center panels of Figures
\ref{fig:granite}-\ref{fig:WGstack}.
We will add just enough of that ``wall paper noise'' to keep
the variance constant as we move into the void.
\par
Given some data $\bold d$, we use it in a filter operator $\bold D$,
and as described with equation (\ref{eqn:exmiss}) we build
a weighting function $\bold W$ that throws out the
broken regression equations (ones that involve missing inputs).
Then we find a PEF $\bold a$ by using this regression.
\begin{equation}
	\bold 0 \quad\approx\quad \bold r \quad = \quad \bold W \bold D \bold a
\end{equation}
Because of the way we defined $\bold W$,
the ``broken'' components of $\bold r$ vanish.
We need to know the standard deviation $\sigma$ of the nonzero terms.
It can be expressed mathematically in a couple of different ways.
Let $\bold 1$ be a vector filled with ones and let
$\bold r^2$ be a vector containing the squares of the components of $\bold r$.
\begin{equation}
\sigma \quad = \quad
	\sqrt{ {1\over N} \sum_{i=1}^N \, r_i^2 }
\quad = \quad
	\sqrt{
		\bold 1' \bold W \bold r^2
	\over
		\bold 1' \bold W \bold 1
	}
\end{equation}
Let us go to a random number generator
and get a noise vector $\bold n$
filled with random numbers of standard deviation $\sigma$.
We will call this the ``added random noise''.
Now we solve this new regression for
the data space $\bold d$ (both known and missing),
\begin{equation}
\bold 0 \quad\approx\quad \bold r \quad = \quad
\bold A \bold d \ -\ \bold n ,
\label{eqn:noiseeverywhere}
\end{equation}
keeping in mind that known data is constrained
(as detailed in chapter \ref{iin/paper:iin}).
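\par
Continuing the one-dimensional toy under the same assumptions
(the name \verb#simulate_missing# is hypothetical),
the whole recipe reads:
estimate $\sigma$ from the surviving residuals,
draw the added random noise,
then fit equation (\ref{eqn:noiseeverywhere}).
\begin{verbatim}
import numpy as np

def simulate_missing(d, known, a, seed=0):
    # Illustrative sketch, not the gee library code.
    # Put zeros in the missing slots of d; they only ever meet
    # zero coefficients below.
    n, na = len(d), len(a)
    T = np.array([[a[t - j] if 0 <= t - j < na else 0.0
                   for j in range(n)] for t in range(na - 1, n)])
    # W keeps the unbroken equations: every input value known.
    unbroken = np.array([known[t - na + 1 : t + 1].all()
                         for t in range(na - 1, n)])
    r = T[unbroken] @ d                # surviving residuals
    sigma = np.sqrt(np.mean(r ** 2))   # sqrt(1'W r^2 / 1'W 1)
    rng = np.random.default_rng(seed)
    nvec = rng.normal(0.0, sigma, T.shape[0])  # added random noise
    free = ~known
    x, *_ = np.linalg.lstsq(T[:, free],
                            nvec - T[:, known] @ d[known], rcond=None)
    out = d.copy()
    out[free] = x
    return out
\end{verbatim}
Here noise is added to every regression equation,
matching equation (\ref{eqn:noiseeverywhere});
the next paragraphs discuss why that is harmless
where the inputs are complete.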
\par
To understand why this works,
consider first the training image, a region of known data.
Although we might think that the data defines the white-noise
residual by $\bold r=\bold A\bold d$, we can also imagine that the white noise
determines the data by $\bold d=\bold A^{-1}\bold r$.
Then consider a region of wholly missing data.
This data is determined by $\bold d=\bold A^{-1}\bold n$.
Since we want the data variance to be the same in known and unknown
locations, naturally we require the variance of $\bold n$
to match that of $\bold r$.
\par
A very minor issue remains.
Regression equations may have all of their required input data,
some of it, or none of it.
Should the $\bold n$ vector add noise to every regression equation?
If a regression equation has all of its input data,
then it contains no free variables,
so it does not matter whether we add noise to it:
the constraints will overcome that noise.
I do not know whether I should worry about how {\it many}
inputs are missing from each regression equation.
%We could choose to add noise only to those regression equations
%that have all their inputs.
%Then we'd be fitting
%$\bold 0 \approx\ \bold r = \bold A \bold d - (\bold I -\bold W) \bold n$.
%I feel, however, that equation
%(\ref{eqn:noiseeverywhere}) would be slightly better.
%The difference would be most apparent when the PEF was allocated
%more space than it required.
%Then we might see a halo of dimness just inside the rim of the void.
\par
It is fun making all this interesting ``wall paper'',
noticing where it is successful and where it isn't.
We cannot help but notice that it seems to work better with
the genuine geophysical data than
it does with many of the highly structured patterns.
Geophysical data is expensive to acquire.
Regrettably, we have uncovered a technolo
