24 41<BR>

21&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;semimajorAxis&nbsp;&nbsp;&nbsp;&nbsp;double

42 53&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;12.3<BR>

22&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conversionFactor double 66 77&nbsp;&nbsp;&nbsp;&nbsp;12.8

<BR>

23&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;inverseFlattening double 66 77&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;12.7

<BR>

24 ENDREC<BR>

25<BR>

26 RECORD E3100<BR>

27&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;velprop&nbsp;&nbsp;double 6 12

7.2<BR>

28&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;REPEAT&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5&nbsp;&nbsp;13

26 39 52 65<BR>

29&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;srcNdx&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;int

13 15<BR>

30&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dstNdx&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;int

16 18<BR>

31&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;slant&nbsp;&nbsp;&nbsp;&nbsp;double

19 25 7.2<BR>

32 ENDREC</FONT></TT>

</BLOCKQUOTE>

<HR>

<H2><A NAME="ParsingRecords"><B><FONT SIZE=5 COLOR=#FF0000>Parsing

Records</FONT></B></A></H2>

<P>

Now that we have an input file, let's tackle parsing the records
within this file. A straightforward approach is to generate three
files from the input file: one header file, one source file
containing all the code for the decoder, and one source file for
the encoder.

<P>

The pseudocode looks something like this:

<BLOCKQUOTE>

<TT><I><FONT FACE="Courier">open file for input<BR>
open files for output<BR>
while (more records)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if recognized start of record<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start structure definitions<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start encoder function preamble<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start decoder function preamble<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if recognized end of record<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;terminate structure definitions<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;terminate encoder function preamble<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;terminate decoder function preamble<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if within record<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;generate structure variable definitions<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;generate encoder function parsing for variable<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;generate decoder function parsing for variable<BR>
close all files</FONT></I></TT>

</BLOCKQUOTE>
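<P>
The control flow of the pseudocode can be tried out on its own before any
files are involved. The following is a minimal sketch, not the chapter's
program: the function name <TT><FONT FACE="Courier">classify_lines</FONT></TT>
and the event strings are hypothetical, and the lines come from a list
instead of a file so that the result is easy to inspect.

```perl
use strict;
use warnings;

# Classify each definition line the way the pseudocode does.
# Returns a list of event strings instead of writing files, so the
# control flow can be inspected on its own. (Hypothetical helper.)
sub classify_lines {
    my @lines = @_;
    my @events;
    my $in_record = 0;
    for my $line (@lines) {
        next if $line =~ /^#/;        # comment line
        next if $line =~ /^\s*$/;     # blank line
        if ($line =~ /^RECORD\s+(\S+)/) {
            push @events, "start $1"; # start of a record
            $in_record = 1;
        }
        elsif ($line =~ /^ENDREC/) {
            push @events, "end";      # end of a record
            $in_record = 0;
        }
        elsif ($in_record) {
            my ($name) = split ' ', $line;
            push @events, "field $name";
        }
        # anything outside a record is ignored
    }
    return @events;
}

my @events = classify_lines(
    "# sample definition",
    "RECORD E3100",
    "    velprop  double 6 12 7.2",
    "ENDREC",
);
print join(", ", @events), "\n";   # start E3100, field velprop, end
```

<P>
In the real generator, each event would trigger output to the header,
encoder, and decoder files instead of being collected in a list.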

<P>

Perl can keep several files open at once, so a single pass over the
input file can produce all three output files: each piece of generated
text is simply printed to the appropriate file handle. In this case,
three files are opened for output:
<TT><FONT FACE="Courier">HDRS</FONT></TT> for the header declarations,
<TT><FONT FACE="Courier">EncD</FONT></TT> for the encoding output,
and <TT><FONT FACE="Courier">DECD</FONT></TT> for the decoding
output. The <TT><FONT FACE="Courier">SAFE</FONT></TT> handle is
used to read the input records from the <TT><FONT FACE="Courier">P286hdrs</FONT></TT>
file. The lines to open the files are

<BLOCKQUOTE>

<TT><FONT FACE="Courier">open (SAFE, &quot;P286hdrs&quot;) ||

die &quot;Cannot open Input file&nbsp;&nbsp;$!\n&quot;;<BR>

open (HDRS, &quot;&gt;P286.h&quot;)&nbsp;&nbsp;|| die &quot;Cannot

open&nbsp;&nbsp;Header $!\n&quot;;<BR>

open (EncD, &quot;&gt;P286enc.c&quot;)&nbsp;&nbsp;|| die &quot;Cannot

open&nbsp;&nbsp;Encoder $!\n&quot;;<BR>

open (DECD, &quot;&gt;P286dec.c&quot;)&nbsp;&nbsp;|| die &quot;Cannot

open&nbsp;&nbsp;Decoder $!\n&quot;;</FONT></TT>

</BLOCKQUOTE>
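<P>
The same idea can be written with lexical file handles and the
three-argument form of <TT><FONT FACE="Courier">open</FONT></TT>. This is a
sketch under assumptions rather than the chapter's code: the helper
<TT><FONT FACE="Courier">open_outputs</FONT></TT> and the role names are
hypothetical, and the files are created in a temporary directory so the
example does not overwrite anything.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Open several output files at once and return their handles keyed by role.
# (Hypothetical helper; the chapter uses the bareword handles shown above.)
sub open_outputs {
    my ($dir, %names) = @_;
    my %fh;
    for my $role (keys %names) {
        my $path = "$dir/$names{$role}";
        open($fh{$role}, '>', $path)
            or die "Cannot open $path: $!\n";
    }
    return %fh;
}

my $dir = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my %out = open_outputs($dir,
    header  => 'P286.h',
    encoder => 'P286enc.c',
    decoder => 'P286dec.c',
);

# Each print goes to whichever handle is named in the braces.
print { $out{header} }  "/* header declarations */\n";
print { $out{encoder} } "/* encoder functions */\n";
print { $out{decoder} } "/* decoder functions */\n";

close $_ for values %out;
```

<P>
Keeping the handles in a hash makes it easy to close them all in one
statement once the input has been processed.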

<P>

After the files are opened, each source and header file needs some
preamble text. Opening a file for writing destroys its previous
contents, so each file has to be initialized from scratch. The calls
to the following functions provide that initialization:

<BLOCKQUOTE>

<TT><FONT FACE="Courier">&amp;startHeaderFile();<BR>

&amp;startEncoderFile();<BR>

&amp;startDecoderFile();</FONT></TT>

</BLOCKQUOTE>

<P>

Then, a <TT><FONT FACE="Courier">while</FONT></TT> loop
reads the input one line at a time.
<P>
After the terminating newline is removed, the incoming line
is checked to see whether it is a comment (a line beginning with
<TT><FONT FACE="Courier">#</FONT></TT>) or a blank line. In either
case, the line is discarded. Look at the following code:

<BLOCKQUOTE>

<TT><FONT FACE="Courier">while (&lt;SAFE&gt;) {<BR>

&nbsp;&nbsp;&nbsp;&nbsp;chomp($_);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;if (/^#/) { next; }<BR>

&nbsp;&nbsp;&nbsp;&nbsp;if (/^\s*$/) { next; }</FONT></TT>

</BLOCKQUOTE>
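<P>
The filtering step can be exercised separately. The sketch below assumes
<TT><FONT FACE="Courier">chomp</FONT></TT> is used to strip the newline: it
removes only a trailing newline, whereas
<TT><FONT FACE="Courier">chop</FONT></TT> removes the last character whatever
it is, which matters when the final line of a file has no newline. The
function name <TT><FONT FACE="Courier">significant_lines</FONT></TT> is
hypothetical.

```perl
use strict;
use warnings;

# Return the lines worth parsing: newline removed, comments and blanks dropped.
sub significant_lines {
    my @kept;
    for my $line (@_) {
        chomp(my $copy = $line);     # removes a trailing newline only
        next if $copy =~ /^#/;       # comment
        next if $copy =~ /^\s*$/;    # blank or white space only
        push @kept, $copy;
    }
    return @kept;
}

# The last line deliberately has no trailing newline.
my @kept = significant_lines(
    "# comment\n", "\n", "   \n", "RECORD E3100\n", "ENDREC",
);
print scalar(@kept), " lines kept\n";   # 2 lines kept
```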

<P>

If the line appears to be non-empty, it's split into the <TT><FONT FACE="Courier">@names</FONT></TT>

array and examined for the tokens <TT><FONT FACE="Courier">RECORD</FONT></TT>,

<TT><FONT FACE="Courier">ENDREC</FONT></TT>, and <TT><FONT FACE="Courier">REPEAT</FONT></TT>.

The default case is to process a variable definition and generate
a structure member declaration along with the code for encoding and
decoding its value.

<P>

If it's a <TT><FONT FACE="Courier">RECORD</FONT></TT> token, a
new declaration is started for a structure. The line is split on
white space: <TT><FONT FACE="Courier">$rname</FONT></TT> receives the
first field (the <TT><FONT FACE="Courier">RECORD</FONT></TT> keyword
itself) and <TT><FONT FACE="Courier">$rtype</FONT></TT> receives the
record type, such as <TT><FONT FACE="Courier">E3100</FONT></TT>. Note
that <TT><FONT FACE="Courier">$_</FONT></TT> is used twice, once in the
pattern match and once in the split. The
<TT><FONT FACE="Courier">split()</FONT></TT> function does not modify
<TT><FONT FACE="Courier">$_</FONT></TT>, and no other function that
would modify it is called in between. The three functions,
<TT><FONT FACE="Courier">&amp;startHeaderRecord($rtype)</FONT></TT>,
<TT><FONT FACE="Courier">&amp;startEncoderFunction($rtype)</FONT></TT>,
and <TT><FONT FACE="Courier">&amp;startDecoderFunction($rtype)</FONT></TT>
take the P286 record type and generate a header declaration, an
encoder function preamble, and a decoder function preamble,
respectively. The <TT><FONT FACE="Courier">$inRecord</FONT></TT> flag
is set to <TT><FONT FACE="Courier">1</FONT></TT> to mark the fact that
we are inside a record, and the
<TT><FONT FACE="Courier">$repeat</FONT></TT> flag is reset to
<TT><FONT FACE="Courier">0</FONT></TT> so that a repeat block from a
previous record does not carry over into the new one. Further
processing of the incoming line is then halted with the
<TT><FONT FACE="Courier">next</FONT></TT> statement. The fragment of
code to start each record is shown here:

<BLOCKQUOTE>

<TT><FONT FACE="Courier">if (/^RECORD/) {<BR>

&nbsp;&nbsp;&nbsp;&nbsp;($rname,$rtype,@rest) = split(' ',$_);

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&amp;startHeaderRecord($rtype);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&amp;startEncoderFunction($rtype);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&amp;startDecoderFunction($rtype);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;$inRecord = 1;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;$repeat = 0;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;next;<BR>

&nbsp;&nbsp;&nbsp;}</FONT></TT>

</BLOCKQUOTE>
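<P>
A short experiment shows what the split puts into each variable and
confirms that <TT><FONT FACE="Courier">$_</FONT></TT> is left untouched. The
sample line is taken from the record definitions earlier in the chapter;
everything else is just scaffolding for the demonstration.

```perl
use strict;
use warnings;

$_ = "RECORD E3100";

# split(' ', ...) splits on runs of white space and ignores leading white space.
my ($rname, $rtype, @rest) = split(' ', $_);

print "rname=$rname rtype=$rtype rest=", scalar(@rest), "\n";
print "line still reads: $_\n";
```

<P>
Here <TT><FONT FACE="Courier">$rname</FONT></TT> ends up holding the keyword
and <TT><FONT FACE="Courier">$rtype</FONT></TT> the record type, while
<TT><FONT FACE="Courier">@rest</FONT></TT> collects any extra fields (none in
this case).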

<P>

The code in the <TT><FONT FACE="Courier">while</FONT></TT> loop also has
to detect the end of a record, which is marked by an <TT><FONT FACE="Courier">ENDREC</FONT></TT>
token. When <TT><FONT FACE="Courier">ENDREC</FONT></TT> is seen,
three functions are called to close the structure and function
declarations started in the <TT><FONT FACE="Courier">RECORD</FONT></TT>
block. Because we are no longer within a record, the value
of <TT><FONT FACE="Courier">$inRecord</FONT></TT> is set to <TT><FONT FACE="Courier">0</FONT></TT>,
and the <TT><FONT FACE="Courier">next</FONT></TT> statement skips any
further processing of this line. The fragment of code to do this
cleanup is shown here:

<BLOCKQUOTE>

<TT><FONT FACE="Courier">if (/^ENDREC/) {<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&amp;stopHeaderRecord($rtype);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&amp;closeEncoderFunction();<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&amp;closeDecoderFunction();<BR>

&nbsp;&nbsp;&nbsp;&nbsp;$inRecord = 0;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;next;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;}</FONT></TT>

</BLOCKQUOTE>

<P>

The <TT><FONT FACE="Courier">REPEAT</FONT></TT> block is entered if
the word <TT><FONT FACE="Courier">REPEAT</FONT></TT> is the first
word on a line. Note that the <TT><FONT FACE="Courier">RECORD</FONT></TT>
and <TT><FONT FACE="Courier">ENDREC</FONT></TT> tokens are matched only
at the very start of a line, whereas the <TT><FONT FACE="Courier">REPEAT</FONT></TT>
pattern allows leading white space before the keyword. The
offsets are extracted in two stages. The first stage splits the
<TT><FONT FACE="Courier">REPEAT</FONT></TT> line into an array called
<TT><FONT FACE="Courier">@allOffsets</FONT></TT>; element 0 (the keyword
itself) is stored in <TT><FONT FACE="Courier">$index</FONT></TT> and
element 1 (the number of offsets) is stored in
<TT><FONT FACE="Courier">$count</FONT></TT>.
The second stage calls the <TT><FONT FACE="Courier">splice()</FONT></TT>
function to remove every item from index 2 onward from
<TT><FONT FACE="Courier">@allOffsets</FONT></TT> and return them as
<TT><FONT FACE="Courier">@offsets</FONT></TT>. The <TT><FONT FACE="Courier">@offsets</FONT></TT>
array then holds the offsets within the record at which the variables
that follow are repeated in blocks. The
<TT><FONT FACE="Courier">next</FONT></TT> statement then proceeds
to the next line of the input file.

<P>

The fragment of code to do this is shown here:

<BLOCKQUOTE>

<TT><FONT FACE="Courier">if (/^[\s]*REPEAT/) {<BR>

&nbsp;&nbsp;&nbsp;&nbsp;$repeat = 1;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;@allOffsets = split(' ',$_);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;$index = $allOffsets[0];<BR>

&nbsp;&nbsp;&nbsp;&nbsp;$count = $allOffsets[1];<BR>

&nbsp;&nbsp;&nbsp;&nbsp;print &quot;INDEX = $index, COUNT= $count\n&quot;;

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;@offsets = splice(@allOffsets,2);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;next;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;}</FONT></TT>

</BLOCKQUOTE>
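<P>
The two stages are easy to check in isolation. The sketch below uses the
<TT><FONT FACE="Courier">REPEAT</FONT></TT> line from the sample record
definitions; the variable name <TT><FONT FACE="Courier">$keyword</FONT></TT>
is chosen here for clarity and is not part of the original listing.

```perl
use strict;
use warnings;

my $line = "    REPEAT      5  13 26 39 52 65";

# Stage 1: split on white space; element 0 is the keyword, element 1 the count.
my @allOffsets = split(' ', $line);
my ($keyword, $count) = @allOffsets[0, 1];

# Stage 2: splice removes everything from index 2 onward and returns it.
my @offsets = splice(@allOffsets, 2);

print "keyword=$keyword count=$count offsets=@offsets\n";
print "left in \@allOffsets: @allOffsets\n";
```

<P>
Note that <TT><FONT FACE="Courier">splice()</FONT></TT> changes
<TT><FONT FACE="Courier">@allOffsets</FONT></TT> as a side effect: after the
call, only the keyword and the count remain in it.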

<P>

Finally, the default processing begins for a line. Before attempting
to parse the line, the code checks whether we are in the middle of
a record. The input file may eventually contain free-form text
outside of records, so this check is a bit of insurance against
variables being declared by accident.

<P>

The incoming line, held in <TT><FONT FACE="Courier">$_</FONT></TT>, is
split on white space, and the first five fields are assigned to five
scalar variables. The call for this is

<BLOCKQUOTE>

<TT><FONT FACE="Courier">(<I>$vname,$vtype,$from,$to,$fmt</I>)

= split(' ',$_);</FONT></TT>

</BLOCKQUOTE>

<P>

There are two ways to process a variable in our case. One applies when
a variable stands by itself, and the other when the variable is in
a block being repeated. If a block is being repeated, it's easier
to declare an array and then generate the encoding and decoding code
for each element. Here's the code to handle this part:

<BLOCKQUOTE>

<TT><FONT FACE="Courier">if ($inRecord)<BR>

&nbsp;&nbsp;&nbsp;&nbsp;{<BR>

&nbsp;&nbsp;&nbsp;&nbsp;($vname,$vtype,$from,$to,$fmt) = split('

',$_);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;if ($repeat == 0)<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&amp;makeHeaderItem($vname,$vtype,$from,$to);

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&amp;encodeVariable($vname,$vtype,$from,$to,$fmt);

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&amp;decodeVariable($vname,$vtype,$from,$to,$fmt);

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<BR>

&nbsp;&nbsp;&nbsp;&nbsp;else<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&amp;makeArrayedItem($vname,$vtype,$count,$from,$to);

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetLen&nbsp;&nbsp;&nbsp;=

$to - $from + 1;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetCount =

0;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;foreach $x (@offsets)

{<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetFrom&nbsp;&nbsp;=
$x + ($from - $offsets[0]);<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetTo&nbsp;&nbsp;&nbsp;&nbsp;=
$offsetFrom + $offsetLen - 1;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetName

= sprintf &quot;%s[%d]&quot;, $vname,$offsetCount;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print

&quot;Name = $offsetName, COUNT= $count\n&quot;;<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&amp;encodeVariable($offsetName,$vtype,

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetFrom,$offsetTo,$fmt);

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&amp;decodeVariable($offsetName,$vtype,

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetFrom,$offsetTo,$fmt);

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetCount++;

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}

<BR>

<BR>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} ## end of else

clause.<BR>

&nbsp;&nbsp;&nbsp;&nbsp;}<BR>

} # of while loop.</FONT></TT>

</BLOCKQUOTE>
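<P>
The arithmetic for a repeated variable can be sketched separately. The
example below rests on two assumptions that the text does not state
outright: each <TT><FONT FACE="Courier">REPEAT</FONT></TT> offset is taken to
be the first column of a block, and the
<TT><FONT FACE="Courier">$from</FONT></TT> and
<TT><FONT FACE="Courier">$to</FONT></TT> columns of a variable are taken to be
inclusive positions within the first block. The function name
<TT><FONT FACE="Courier">expand_repeated</FONT></TT> is hypothetical; it is an
illustration under those assumptions, not a description of the listing.

```perl
use strict;
use warnings;

# Expand one field of a repeated block into per-block element names and columns.
# $from and $to are the field's columns in the first block; @offsets holds the
# first column of every block (assumed layout, see the lead-in).
sub expand_repeated {
    my ($vname, $from, $to, @offsets) = @_;
    my $shift = $from - $offsets[0];   # field position inside a block
    my $width = $to - $from;           # inclusive columns: last = first + width
    my @items;
    my $i = 0;
    for my $start (@offsets) {
        my $first = $start + $shift;
        push @items, [ sprintf("%s[%d]", $vname, $i), $first, $first + $width ];
        $i++;
    }
    return @items;
}

# dstNdx occupies columns 16 to 18 of the first block in the sample record.
for my $item (expand_repeated('dstNdx', 16, 18, 13, 26, 39, 52, 65)) {
    printf "%-10s %3d %3d\n", @$item;
}
```

<P>
With the sample offsets, the second element
<TT><FONT FACE="Courier">dstNdx[1]</FONT></TT> lands on columns 29 to 31,
which is the first block's position shifted by the 13-column block length.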

<P>

The <TT><FONT FACE="Courier">while</FONT></TT> loop continues
to process each line in the input file until all the record definitions
have been completed. After the <TT><FONT FACE="Courier">while</FONT></TT>
loop ends, any terminal processing that must be done is completed
and all open files are closed:

<BLOCKQUOTE>

<TT><FONT FACE="Courier">close (SAFE);<BR>

&amp;closeHeaderFile();<BR>

close (HDRS);<BR>

close (EncD);<BR>

close (DECD);</FONT></TT>

</BLOCKQUOTE>

<P>

When the program terminates, you should have three files in the

directory: <TT><FONT FACE="Courier">P286.h</FONT></TT>, <TT><FONT FACE="Courier">P286enc.c</FONT></TT>,

and <TT><FONT FACE="Courier">P286dec.c</FONT></TT>. The headers

are declared in the <TT><FONT FACE="Courier">P286.h</FONT></TT>

file, encoding functions are declared in the <TT><FONT FACE="Courier">P286enc.c</FONT></TT>