the fundamental property of interest, ``causal with a causal
(convolutional) inverse.''  I could call it CwCI.
An example of a causal filter without a causal inverse is the unit
delay operator --- with $Z$-transforms, the operator $Z$ itself.
If you delay something, you can't get it back without seeing into the future,
which you are not allowed to do.
Mathematically, $1/Z$ cannot be expressed as a polynomial
(actually, a convergent infinite series) in positive powers of $Z$.
\par
Physics books don't tell us where to expect to find
transfer functions that are CwCI.
I think I know why they don't.
Any causal filter has a ``sharp edge'' at zero time lag where it switches
from nonresponsiveness to responsiveness.
The sharp edge might cause the spectrum to be large at infinite frequency.
If so, the inverse filter is small at infinite frequency.
Either way, one of the two filters is unmanageable with Fourier transform
theory, which (you might have noticed in the mathematical fine print)
requires signals (and spectra) to have finite energy,
which means the function must get real small in that immense space
on the $t$-axis and the $\omega$-axis.
It is impossible for a function and its inverse to both be small.
These imponderables become more manageable in the world of Time Series
Analysis (discretized time axis).
\subsection{The spectral factorization concept}
Interesting questions arise when we are given a spectrum
and find ourselves asking how to find a filter that has that spectrum.
Is the answer unique?  We'll see that it is not.
Is there always an answer that is causal?
Almost always, yes.
Is there always an answer that is causal with a causal inverse (CwCI)?
Almost always, yes.
\par
Let us have an example.
Consider a filter like the familiar time derivative $(1,-1)$, except
let us downweight the $-1$ a tiny bit, say $(1,-\rho)$, where $0\ll\rho<1$.
The filter $(1,-\rho)$
has a spectrum $(1-\rho Z)(1-\rho/Z)$ with autocorrelation
coefficients $(-\rho,\ 1+\rho^2,\ -\rho)$ that look a lot like a second
derivative, but a tiny bit bigger in the middle.
Two different waveforms, $(1,-\rho)$ and its time reverse,
both have this same autocorrelation.
Spectral factorization could give us both $(1,-\rho)$ and $(\rho,-1)$,
but we always want the one that is CwCI.
The bad one is weaker on its first pulse, and
its inverse is not causal.
Below are two expressions for the filter inverse to $(\rho,-1)$,
the first divergent
(filter coefficients at infinite lag are infinitely strong),
the second convergent but noncausal.
\begin{eqnarray}
{1\over \rho -Z} &=& {1\over\rho}\ ( 1 + Z/\rho + Z^2/\rho^2 + \cdots) \\
{1\over \rho -Z} &=& {-1\over Z}\ ( 1 + \rho/Z + \rho^2/Z^2 + \cdots)
\end{eqnarray}
(Please multiply each equation by $\rho -Z$ and see it reduce to $1=1$.)
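These claims are easy to verify numerically.
Below is a small check, sketched in Python with NumPy
(the variable names are mine):
the two waveforms share one autocorrelation, and the causal inverse
series converges for $(1,-\rho)$ but diverges for $(\rho,-1)$.
\par\noindent
\footnotesize
\begin{verbatim}
import numpy as np

rho = 0.9
good = np.array([1.0, -rho])   # the CwCI waveform (1, -rho)
bad  = np.array([rho, -1.0])   # its time reverse (rho, -1)

# Both have the autocorrelation (-rho, 1+rho^2, -rho):
print(np.convolve(good, good[::-1]))   # [-0.9  1.81 -0.9]
print(np.convolve(bad,  bad[::-1]))    # [-0.9  1.81 -0.9]

# Causal inverse coefficients: rho^n decays for the good filter,
# while the divergent series above grows like rho^(-n-1).
n = np.arange(5)
print(rho**n)        # 1.0  0.9  0.81 ...   (stable)
print(rho**-(n+1))   # 1.11 1.23 1.37 ...   (divergent)
\end{verbatim}
\normalsize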
\par
So we start with a power spectrum, and we should find
a CwCI filter with that energy spectrum.
If you input to the filter an infinite sequence of random numbers
(white noise),
you should output something with the original power spectrum.
\par
We easily inverse Fourier transform the square root of the power spectrum,
getting a symmetrical time function, but
we need a function that vanishes before $\tau=0$.
On the other hand,
if we already had a causal filter with the correct spectrum,
we could manufacture many others.
To do so, all we need is a family of delay operators to convolve with.
A pure delay filter does not change the spectrum of anything.
The same is true of frequency-dependent delay operators.
Here is an example of a frequency-dependent delay operator:
first convolve with $(1,2)$, and then deconvolve with $(2,1)$.
Both of these have the same amplitude spectrum, so their ratio
has unit amplitude (and nontrivial phase).
If you multiply $(1+2Z)/(2+Z)$ by its Fourier conjugate
(replace $Z$ by $1/Z$), the resulting spectrum is $1$ for all $\omega$.
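A quick numeric check of this all-pass claim
(again a Python sketch; the frequency grid is an arbitrary choice):
\par\noindent
\footnotesize
\begin{verbatim}
import numpy as np

w = np.linspace(0.0, np.pi, 5)
Z = np.exp(1j * w)              # Z evaluated on the unit circle
ratio = (1 + 2*Z) / (2 + Z)
print(np.abs(ratio))            # all ones: unit amplitude spectrum
print(np.angle(ratio))          # nontrivial, frequency-dependent phase
\end{verbatim}
\normalsize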
\par
Anything whose nature is delay is death to CwCI.
The CwCI filter has its energy as close as possible to $\tau=0$.
More formally, my first book, FGDP, proves that the CwCI filter
has, for every time $\tau$, more energy between $t=0$ and $t=\tau$
than any other filter with the same spectrum.
\par
Spectra can be factorized by an amazingly wide variety of techniques,
each of which gives you a different insight into this strange beast.
They can be factorized by factoring polynomials, by inserting power series
into other power series, by solving least-squares problems, and
by taking logarithms and exponentials in the Fourier domain.
I've coded most of them and still find them all somewhat mysterious.
\par
Theorems in Fourier analysis can be interpreted physically in two
different ways, one as given, the other with time and frequency reversed.
For example, convolution in one domain amounts to multiplication in the other.
If we were to express the CwCI concept with reversed domains,
instead of saying the ``energy comes as quickly as possible after $\tau=0$,''
we would say ``the frequency function is as close to $\omega=0$ as possible'';
in other words, it is minimally wiggly with time.
Most applications of spectral factorization begin with a spectrum,
a real, positive function of frequency.
I once achieved minor fame by starting with a real, positive function of
space, a total magnetic field $\sqrt{H_x^2 +H_z^2}$ measured along the
$x$-axis, and reconstructing the magnetic field components $H_x$ and $H_z$
that were minimally wiggly in space.
\subsection{Cholesky decomposition}
Conceptually, the simplest computational method of spectral factorization
might be ``Cholesky decomposition.''
For example, the matrix of (\ref{eqn:lapfacmat})
could have been found by Cholesky factorization of (\ref{eqn:huge}).
The Cholesky algorithm takes a positive-definite matrix
$\bold Q$ and factors it into a triangular matrix
times its transpose,
say $\bold Q = \bold T' \bold T$.
%Equation (\ref{ajt/eqn:polydiv}) is an example of a banded triangular matrix.
\par
It is easy to reinvent the Cholesky factorization algorithm.
To do so,
simply write all the components of a $3\times 3$ triangular matrix
$\bold T$, and then explicitly multiply these elements
by the transpose matrix $\bold T'$.
You will find that you have everything you need
to recursively build the elements of $\bold T$
from the elements of $\bold Q$.
Likewise for a $4\times 4$ matrix, and so on.
\par
The $1\times 1$ case shows that the Cholesky algorithm requires square roots.
Matrix elements are not always numbers.
Sometimes they are polynomials such as $Z$-transforms.
To avoid square roots, there is a variation
of the Cholesky method in which we factor $\bold Q$ into
$\bold Q=\bold T'\bold D\bold T$,
where $\bold D$ is a diagonal matrix.
\par
Once a matrix has been factored into upper and lower triangles,
solving simultaneous equations
is simply a matter of two backsubstitutions.
(We looked at a special case of backsubstitution
with equation (\ref{ajt/eqn:polydiv}).)
For example, we often encounter simultaneous equations of the form
$\bold B'\bold B\bold m=\bold B'\bold d$.
Suppose the positive-definite matrix
$\bold B'\bold B$ has been factored into triangle form, giving
$\bold T'\bold T\bold m=\bold B'\bold d$.
To find $\bold m$,
we first backsolve
$\bold T'\bold x=\bold B'\bold d$
for the vector
$\bold x$.
Then we backsolve
$\bold T\bold m=\bold x$.
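The reinvention suggested above fits in a few lines.
Here is a sketch (Python with NumPy; the function names are mine)
of the elementwise recursion, followed by the two backsubstitutions.
(The \texttt{np.linalg.solve} calls stand in for true triangular
solves, which a production code would use.)
\par\noindent
\footnotesize
\begin{verbatim}
import numpy as np

def cholesky(Q):
    # Build lower-triangular L with Q = L L', exactly the recursion
    # discovered by expanding the 3 x 3 product element by element.
    n = len(Q)
    L = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            s = Q[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = np.sqrt(s) if i == j else s / L[j, j]
    return L

def solve(B, d):
    # Solve B'B m = B'd by two backsubstitutions, with T = L'.
    L = cholesky(B.T @ B)
    x = np.linalg.solve(L, B.T @ d)   # T' x = B'd  (sweep down)
    return np.linalg.solve(L.T, x)    # T  m = x    (sweep back up)
\end{verbatim}
\normalsize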
When $\bold T$ happens to be a band matrix,
the first backsubstitution is filtering down a helix,
and the second is filtering back up it.
Polynomial division is a special case of backsubstitution.
\par
Poisson's equation
$\nabla^2 \bold p = -\bold q$
requires boundary conditions, which we can honor
when we filter starting from both ends.
We cannot simply solve Poisson's equation as
an initial-value problem.
We could insert the Laplacian operator
into the polynomial-division program,
but the solution would diverge.
\par
Being a matrix method, the Cholesky method of factorization
has a cost proportional to the cube of the size of the matrix.
Because our problems are very large,
and because the Cholesky method
does not produce a useful result if we stop partway to completion,
we look further.
The Cholesky method is powerful, but it does more than we require.
It does not require band matrices,
yet band matrices are what we very often find in applications,
so we seek methods that take advantage of their special properties.
\subsection{Toeplitz methods}
Band matrices are often called Toeplitz matrices.
In the subject of Time Series Analysis are found
spectral factorization methods whose computations are
proportional to the square of the dimension of the matrix.
They can often be terminated early with a reasonable partial result.
Two Toeplitz methods, the Levinson method
and the Burg method, are described in my first textbook, FGDP.
Our interest is multidimensional data sets, so
the matrices of interest are truly huge.
Thus, before we find Toeplitz methods
especially useful, we may need to find
ways to take advantage of the sparsity of our filters.
\subsection{Kolmogoroff spectral factorization}
With Fourier analysis we find a method of spectral factorization
that is as fast as Fourier transformation,
namely $N\log N$ for a matrix of size $N$.
This is very appealing.
An earlier
version of this book included such an algorithm.
Pedagogically, I didn't like it in this book, because it
requires lengthy off-topic discussions of Fourier analysis
that are already found in both my first book, FGDP, and my third book, PVI.
\par
The Kolmogoroff calculation is based on the logarithm of the spectrum.
The logarithm of zero is minus infinity --- an indicator that
perhaps we cannot factorize a spectrum that becomes zero at any frequency.
Actually, the logarithmic infinity is the gentlest kind.
The logarithm of the smallest nonzero value in single-precision
arithmetic is about $-36$, which might not ruin your average calculation.
Mathematicians have shown that the integral of the logarithm of
the spectrum must be bounded, so that some isolated zero values
of the spectrum are not a disaster.
In other words, we can factor
the (negative) second derivative to get the first derivative.
This suggests we will never find a causal bandpass filter:
it is a contradiction to desire both
causality and a spectral band of zero gain.
\par
The weakness of the Kolmogoroff method is related to its strength.
Fourier methods strictly require the matrix to be a band matrix.
A problem many people would like to solve is how to handle a matrix
that is ``almost'' a band matrix --- a matrix whose band
changes slowly with location.
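For readers who want the flavor of the calculation without the lengthy
Fourier discussion, here is a minimal sketch (Python with NumPy;
the function name and the $512$-point grid are my choices):
take the logarithm of the spectrum,
keep the causal part of its inverse transform,
and exponentiate back.
\par\noindent
\footnotesize
\begin{verbatim}
import numpy as np

def kolmogoroff(S):
    # S: positive power spectrum sampled on n FFT frequencies.
    n = len(S)
    cep = np.fft.ifft(np.log(S)).real   # symmetric in lag
    cep[1:n//2] *= 2.0                  # double the positive lags
    cep[n//2+1:] = 0.0                  # zero the negative lags
    # exponentiating the causal part yields the CwCI wavelet
    return np.fft.ifft(np.exp(np.fft.fft(cep))).real

rho = 0.9
S = np.abs(np.fft.fft([1.0, -rho], 512))**2
print(kolmogoroff(S)[:3])               # about (1, -rho, 0)
\end{verbatim}
\normalsize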
\begin{comment}
\subsection{Blind deconvolution}
An area of applications that leads directly to spectral factorization
is ``blind deconvolution.''
Here we begin with a signal.
We form its spectrum and factor it.
We could simply inspect the filter and interpret it,
or we might deconvolve it out from the original data.
This topic deserves a fuller exposition, say, for example,
as defined in some of my earlier books.
Here we inspect a novel example that incorporates the helix.
\activeplot{solar}{width=6in,height=3.3in}{NR}{ Raw seismic data on the sun (left).  Impulse response of the sun (right) derived by Helix-Kolmogoroff spectral factorization. }
\par
Solar physicists have learned how to measure
the seismic field of the sun's surface.
If you created an impulsive explosion on the surface of the sun,
what would the response be?
James Rickett and I applied the helix idea along with Kolmogoroff
spectral factorization to find the impulse response of the sun.
Figure \ref{fig:solar} shows a raw data cube and the derived impulse response.
The sun is huge, so the distance scale is in megameters (Mm).
The United States is 5 Mm wide.
Vertical motion of the sun is measured with a video-camera-like device
that measures vertical motion by Doppler shift.
From an acoustic/seismic point of view,
the surface of the sun is a very noisy place.
The figure shows time in kiloseconds (Ks).
We see about 15 cycles in 5 Ks, which is 1 cycle in about 333 sec.
Thus, the sun seems to oscillate vertically with about a 5-minute period.
The top plane of the raw data
in Figure \ref{fig:solar} (left panel)
happens to have a sunspot in the center.
The data analysis here is not affected by the sunspot, so please ignore it.
\par
The first step of the data processing is
to transform the raw data to its spectrum.
With the helix assumption, computing the spectrum is
virtually the same thing in 1-D space as in 3-D space.
The resulting spectrum was passed to Kolmogoroff spectral factorization code.
The resulting impulse response is on the right side of Figure \ref{fig:solar}.
The plane we see on the right top is not lag time $\tau=0$;
it is lag time $\tau=2$ Ks.
It shows circular rings, as ripples on a pond.
Later lag times (not shown) would be the larger circles of expanding waves.
The front and side planes show tent-like shapes.
The slope of the tent gives the (inverse) velocity of the wave
(as seen on the surface of the sun).
The horizontal velocity we see on the sun's surface turns out
(by Snell's law)
to be the same as that at the bottom of the ray.
On the front face at early times, we see the low-velocity (steep) wavefronts,
and at later times we see the faster waves.
This is because the later arrivals reach more deeply into the sun.
Look carefully, and you can see two (or even three!) tents inside one another.
These ``inside tents'' are the waves that have bounced once (or more!)
from the surface of the sun.
When a ray goes down and back up to the sun's surface,
it reflects and takes off again with the same ray shape.
The result is that a given slope on the traveltime curve
can be found again at twice the distance at twice the time.
\end{comment}
\section{WILSON-BURG SPECTRAL FACTORIZATION}
(If you are new to this material, you should pass over this section.)
Spectral factorization is the job of taking a power spectrum
and from it finding a causal (zero before zero time) filter with that spectrum.
Methods for this task (there are many)
not only produce a causal wavelet,
but they typically produce one whose
convolutional inverse is also causal.
(This is called the ``minimum-phase'' property.)
In other words, with such a filter we can do stable deconvolution.
Here
I introduce a new method of spectral factorization
that looks particularly suitable for the task at hand.
I learned this new method from John Parker Burg, who
attributes it to an old paper by Wilson.
(I find Burg's explanation, below, much clearer than Wilson's.)
\par
%\begin{notforlecture}
\begin{comment}
Below find subroutine \texttt{lapfac()}, which
was used in the previous section
to factor the Laplacian operator.
\end{comment}
To invoke the factorization subroutine,
you need to supply one side of an autocorrelation function.
For example, let us specify the negative of the 2-D Laplacian
(an autocorrelation)
in a vector {\tt n = }$256\times 256$ points long.
\par\noindent
\footnotesize
\begin{verbatim}
 rr[0]   =  4.
 rr[1]   = -1.
 rr[256] = -1.
\end{verbatim}
\normalsize
%\par\noindent
%\begin{comment}
%The reason for the
%$4.00004$ instead of $4.0$
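While waiting for Burg's explanation, here is a rough 1-D
frequency-domain caricature of the Wilson iteration as I understand it
(a Python sketch; the name \texttt{wilson}, the grid size, and the test
values are mine, and the subroutine invoked above works on a helix
rather than with these FFTs).
Writing the desired factorization as $S = A\bar A$ and linearizing
around a causal guess $A_0$ gives
$S/(A_0\bar A_0) \approx 1 + \Delta A/A_0 + \Delta\bar A/\bar A_0$,
so the causal part (with half the zero lag) of $1 + S/(A_0\bar A_0)$,
multiplied by $A_0$, is the updated $A$.
\par\noindent
\footnotesize
\begin{verbatim}
import numpy as np

def wilson(r, niter=10, n=1024):
    # r: one side of an autocorrelation, r[0] at zero lag.
    S = np.fft.fft(np.r_[r, np.zeros(n - 2*len(r) + 1), r[:0:-1]]).real
    a = np.zeros(n)
    a[0] = np.sqrt(r[0])              # initial causal guess
    for _ in range(niter):
        A = np.fft.fft(a)
        b = np.fft.ifft(S / (A * A.conj())).real
        b[0] = 0.5 * (b[0] + 1.0)     # half the zero lag of 1 + S/(A Abar)
        b[n//2:] = 0.0                # keep only the causal part
        a = np.fft.ifft(np.fft.fft(b) * A).real
        a[n//2:] = 0.0                # enforce causality of the update
    return a

rho = 0.9
print(wilson(np.array([1 + rho**2, -rho]))[:3])   # about (1, -rho, 0)
\end{verbatim}
\normalsize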