.\" Copyright (c) 1990 Entropic Speech, Inc.
.\" Copyright (c) 1996 Entropic Research Laboratory, Inc. All rights reserved.
.\" @(#)fftcep.1	1.5 01 Oct 1998 ESI/ERL
.ds ]W (c) 1993 Entropic Research Laboratory, Inc.
.TH FFTCEP 1\-ESPS 01 Oct 1998
.SH "NAME"
fftcep \- FFT-based complex cepstrum of ESPS sampled data
.SH "SYNOPSIS"
.B fftcep
[
.BI \-P " param"
] [
.BI \-r " range"
] [
.BI \-l " frame_len"
] [
.BI \-S " step"
] [
.BI \-w " window_type"
] [
.BI \-o " order"
] [
.B \-F
] [
.B \-R
] [
.BI \-e " element_range"
] [
.BI \-z " zeroing_range"
] [
.BI \-x " debug_level"
]
.I sd_file
.I fea_file
.SH "DESCRIPTION"
.PP
.I fftcep
takes an input ESPS sampled data (SD or FEA_SD) file,
\fIsd_file\fP, and finds the complex cepstrum of one or more fixed-length
segments to produce an ESPS FEA file,
.IR fea_file .
If the input file name
.I sd_file
is replaced by "\-", stdin is read; similarly, if
.I fea_file
is replaced by "\-", the output is written to stdout.
The \fIFEA_SD\fP(5\-\s-1ESPS\s+1) files support complex
sampled data, and \fIfftcep\fP will process complex input data,
as well as multichannel data.
.PP
The complex cepstrum of a single frame is computed by obtaining the
FFT of the (possibly windowed) data frame, computing the complex
logarithm of this spectrum, and finding the inverse FFT of the log
spectral data. The complex cepstrum of single channel input is
stored in the field \fIcepstrum_0\fP, whose default data type is
FLOAT_CPLX.  If the input has \fIN\fP channels, the cepstral
data corresponding to the \fIi\fPth channel is stored in field
\fIcepstrum_i\fP, where 0 <= \fIi\fP < \fIN\fP; the data in each
channel is processed identically.
.PP
The \fB\-R\fP option specifies that the cepstrum rather than the complex
cepstrum should be computed. In this case, the inverse FFT is performed on
the log magnitude of the spectrum, rather than on the complex
logarithm of the spectrum.
Under the \fB\-R\fP option, the cepstrum of
single channel data is stored in the field \fIcepstrum_0\fP and
its default data type is FLOAT_CPLX. Field names for multichannel data are constructed
as in the complex cepstrum case, and the data in each channel
is processed identically.
.PP
When computing either the cepstrum or complex cepstrum,
the option \fB\-F\fP specifies that the imaginary
part of the resulting data be discarded and only the real part be
stored; the field \fIcepstrum_0\fP then has type FLOAT.
.PP
All input frames have the same length
.I frame_len
(see the \fB\-l\fP option).  The
initial point of the first frame is determined by the \fB\-r\fP option or by
.I start
in the parameter file.  Initial points of subsequent frames follow at
equal intervals of
.I step
samples (see the \fB\-S\fP option).  Thus the three cases
.I step
<
.IR frame_len ,
.I step
=
.IR frame_len ,
and
.I step
>
.I frame_len
correspond to overlapping frames, exactly abutted frames, and frames separated by
gaps, respectively.
.PP
The number of frames is the minimum sufficient to cover a specified range of
.I nan
points (see the \fB\-r\fP option), given
.I frame_len
and
.IR step .
The last frame may
overrun the range, in which case a warning is issued.  If a frame overruns
the end of the input file, it is filled out with zeros.
.PP
The FFT cepstral routines used by \fIfftcep\fP return
2^\fIorder\fP values (see \fIfft_cepstrum\fP(3\-\s-1ESPS\s+1)).  The default
is to store all 2^\fIorder\fP values in \fIcepstrum_0\fP
in the same order as returned by \fIget_cfft_inv\fP(3\-\s-1ESPS\s+1).
With \fIN\fP = 2^\fIorder\fP, the cepstral sequence is
returned in the following order:
.br
c(0), c(1),..., c(N/2), c(\-(N/2) + 1), c(\-(N/2) + 2),..., c(\-1).
.br
It is possible to specify that a subrange of these values
be used to form \fIcepstrum_0\fP (see the \fB\-e\fP option).
This makes it possible to discard cepstral information not relevant
to further processing.
.PP
It is possible to perform simple filtering on the cepstral data.
The \fB\-z\fP option can be used to set elements of the cepstral data to zero.
To apply more complicated operations to the cepstral data, \fBfeafunc\fP(1\-ESPS) can be
used to process the field \fIcepstrum_0\fP directly and \fBmake_sd\fP(1\-ESPS) can be used to
create \fIFEA_SD\fP(5\-ESPS) files which can be processed with \fBwindow\fP(1\-ESPS).
.PP
\fBExample Shell Script\fP
.br
The following shell script is an example of using \fIfftcep\fP
to analyze a segment of speech.  The FFT spectrum as found by
\fIfft\fP(1\-\s-1ESPS\s+1) can be compared to the spectrum
of the liftered cepstral sequence computed by \fIfftcep\fP.
A 512 point segment is read from the
file \fI/usr/esps/demo/speech.sd\fP by \fIfft\fP and \fIfftcep\fP.
Both programs compute a 1024 point FFT from the Hamming windowed
sequence, and \fIfftcep\fP is forced to compute and store the real
part of the power cepstrum.  Liftering is performed by using the
\fB\-z\fP option to set the long-time cepstral components to zero.
The program \fImake_sd\fP(1\-\s-1ESPS\s+1) must be used to translate
the FEA file output of \fIfftcep\fP into the FEA_SD file format
expected by \fIfft\fP.
The program \fIplotspec\fP(1\-\s-1ESPS\s+1) can be used to compare
the FFT spectrum with the spectrum of the liftered cepstral sequence.
.PP
#!/bin/csh
.br
set spfile = /usr/esps/demo/speech.sd
.br
set sf = `hditem -i record_freq $spfile`
.br
#
.br
fft -r1000:1511 -o10 -l512 -wHAMMING $spfile speech.spec
.br
plotspec -p0:4000 speech.spec
.br
#
.br
fftcep -r1000:1511 -o10 -l512 -wHAMMING -F -R -z23:1000 \\
.br
	$spfile speech.cep
.br
make_sd -r1: -fcepstrum_0 -S$sf speech.cep speech.cepsd
.br
fft -p1:3072 -o10 -l1024 speech.cepsd speech.cspec
.br
plotspec -p0:4000 speech.cspec
.SH "OPTIONS"
.PP
The following options are supported:
.TP
.BI \-P " param"
uses the parameter file
.I param
rather than the default, which is
.IR params .
.TP
.BI \-r " first" : "last"
.TP
.BI \-r " first" \- "last"
.TP
.BI \-r " first" :+ "incr"
In the first two forms, a pair of unsigned integers specifies the range of
sampled data to analyze.
If
.IR last " = " first " + " incr ,
the third form (with the plus sign) specifies the same range as the
first two forms.  If
.I first
is omitted, the default value of 1 is used.  If
.I last
is omitted, the range extends to the end of the file.
If the specified range extends beyond the end of the file,
it is reduced to end at the end of the file.
Then, if the range doesn't end on a frame boundary,
it is extended to make up a full last frame.
If the range, so extended, goes past the end of the file,
the last frame is filled out with zeros.
All forms of the option override the values of
.I start
and
.I nan
in the parameter file or ESPS Common file.
The first two forms are equivalent to supplying values of
.I first
for the parameter
.I start
and
.RI ( last " + 1 \- " first )
for the parameter
.IR nan .
The third form is equivalent to values of
.I first
for
.I start
and
.RI ( incr " + 1)"
for
.IR nan .
If the \fB\-r\fP option is not
used, the range is determined from the ESPS Parameter or Common file if the
appropriate parameters are present.
.TP
.BI \-l " frame_len" "\fR [0]"
Specifies the length of each frame.
If the option is omitted, the parameter file is consulted.
A value of 0 (from either the option or the parameter file)
indicates that a default value is to be used:
the transform length determined by the order.  (See the
.B \-o
option and the
.I order
parameter.)
This is also the default value in case
.I frame_len
is not specified either with the
.B \-l
option or in the parameter file.
.TP
.BI \-S " step" "\fR [" frame_len "\fR]"
Initial points of consecutive frames differ by this number of samples.
If the option is omitted, the parameter file is consulted,
and if no value is found there, a default equal to
.I frame_len
is used (resulting in exactly abutted frames).
The same default applies if \fIstep\fP is given a value of 0.
.TP
.BI \-w " window_type" " \fR[RECT]"
The name of the data window to apply to the data in each frame before
computing the FFT.
If the option is omitted, the parameter
file is consulted, and if no value is found there, the default used is a
rectangular window with amplitude one.  Possible window types include
rectangular ("RECT"), Hamming ("HAMMING"), Hanning ("HANNING"),
cos^4 ("COS4"), and triangular ("TRIANG"); see
the \fIwindow\fP(3\-ESPS) manual page for the complete list.
.TP
.BI \-o " order" "\fR [10]\fP"
The order of the FFT and inverse FFT; the transform length is 2^\fIorder\fP
(2 to the \fIorder\fP-th power).  If the number of data points in each frame
(the frame length) is less
than the transform length, the frame is padded with zeros (a warning is
given).  If the number of data points in each frame exceeds the transform
length, only the first 2^\fIorder\fP points from each frame are transformed;
that is, points are effectively skipped between transforms (a warning is
given).  The default order is 10 (transform length 1024).
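.PP
\fBIllustrative Sketch\fP
.br
The per-frame arithmetic described for the \fB\-R\fP and \fB\-o\fP options
can be sketched in Python.
This is not ESPS code and makes no claim to match the \fIfftcep\fP implementation:
the names \fIfit_frame\fP, \fIfft\fP, and \fIreal_cepstrum\fP are hypothetical,
the \fIfloor\fP guard against log(0) is an assumption, windowing is omitted,
and no warnings are issued when a frame is padded or truncated.
Only the \fB\-R\fP (log magnitude) case is shown; the complex cepstrum
additionally needs a complex logarithm with a continuous phase, which is not attempted here.
.nf
```python
import cmath
import math


def fit_frame(frame, order):
    """Zero-pad or truncate a frame to the transform length 2**order."""
    n = 2 ** order
    # Keep at most the first n points, then pad with zeros if the frame is short.
    return list(frame[:n]) + [0.0] * max(0, n - len(frame))


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. No scaling."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1.0 if inverse else -1.0
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def real_cepstrum(frame, order, floor=1e-12):
    """Inverse FFT of the log magnitude spectrum of one frame."""
    n = 2 ** order
    spectrum = fft(fit_frame(frame, order))
    # 'floor' is an assumed guard so that a zero magnitude does not reach log().
    log_mag = [math.log(max(abs(s), floor)) for s in spectrum]
    # The inverse transform above is unscaled, so divide by n here.
    return [c / n for c in fft(log_mag, inverse=True)]
```
.fi
Element \fIk\fP of the returned list holds c(\fIk\fP) for \fIk\fP <= N/2 and
c(\fIk\fP \- N) for larger \fIk\fP, which is the ordering listed in DESCRIPTION.
Taking the real part of each element corresponds to what \fB\-F\fP stores.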
