24 41<BR>
21 semimajorAxis double 42 53 12.3<BR>
22 conversionFactor double 66 77 12.8<BR>
23 inverseFlattening double 66 77 12.7<BR>
24 ENDREC<BR>
25<BR>
26 RECORD E3100<BR>
27 velprop double 6 12 7.2<BR>
28 REPEAT 5 13 26 39 52 65<BR>
29 srcNdx int 13 15<BR>
30 dstNdx int 16 18<BR>
31 slant double 19 25 7.2<BR>
32 ENDREC</FONT></TT>
</BLOCKQUOTE>
<HR>
<H2><A NAME="ParsingRecords"><B><FONT SIZE=5 COLOR=#FF0000>Parsing
Records</FONT></B></A></H2>
<P>
Now that we have an input file, let's tackle parsing the records
within it. A straightforward approach is to generate three files
from the input file: one header file, one source file containing
the decoder, and one source file containing the encoder.
<P>
The pseudocode looks something like this:
<BLOCKQUOTE>
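<TT><FONT FACE="Courier"><I>open file for input<BR>
open files for output<BR>
while (more records)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if recognized start of record<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start structure definitions<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start encoder function preamble<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;start decoder function preamble<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if recognized end of record<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;terminate structure definitions<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;terminate encoder function preamble<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;terminate decoder function preamble<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if within record<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;generate structure variable definitions<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;generate encoder function parsing for variable<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;generate decoder function parsing for variable<BR>
close all files</I></FONT></TT>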
</BLOCKQUOTE>
<P>
Perl can have more than one file open at once, so a single pass
over the input file can generate all three files: each piece of
output is simply sent to its respective file handle. In this case,
three files are opened for output:
<TT><FONT FACE="Courier">HDRS</FONT></TT> for the header declarations,
<TT><FONT FACE="Courier">EncD</FONT></TT> for the encoding output,
and <TT><FONT FACE="Courier">DECD</FONT></TT> for the decoding
output. The <TT><FONT FACE="Courier">SAFE</FONT></TT> handle is
used to read the input records from the <TT><FONT FACE="Courier">P286hdrs</FONT></TT>
file. The lines to open the files are
<BLOCKQUOTE>
<TT><FONT FACE="Courier">open (SAFE, "P286hdrs") ||
die "Cannot open Input file $!\n";<BR>
open (HDRS, ">P286.h") || die "Cannot
open Header $!\n";<BR>
open (EncD, ">P286enc.c") || die "Cannot
open Encoder $!\n";<BR>
open (DECD, ">P286dec.c") || die "Cannot
open Decoder $!\n";</FONT></TT>
</BLOCKQUOTE>
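<P>
As a minimal sketch of the same idea in a more recent Perl style,
the following uses lexical file handles and the three-argument form
of <TT><FONT FACE="Courier">open</FONT></TT>. The helper name
<TT><FONT FACE="Courier">open_outputs</FONT></TT> and the scratch
directory are assumptions made for illustration; they are not part
of the chapter's program.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Open several output files at once and return a name => handle map.
# open_outputs() is a hypothetical helper, not part of the original script.
sub open_outputs {
    my ($dir, @names) = @_;
    my %fh;
    for my $name (@names) {
        open(my $out, '>', "$dir/$name")
            or die "Cannot open $name: $!\n";
        $fh{$name} = $out;
    }
    return %fh;
}

# Work in a scratch directory so nothing in the current directory is touched.
my $dir = tempdir(CLEANUP => 1);
my %out = open_outputs($dir, 'P286.h', 'P286enc.c', 'P286dec.c');

# All three handles are open at the same time; each print names its target.
print { $out{'P286.h'} }    "/* header */\n";
print { $out{'P286enc.c'} } "/* encoder */\n";
print { $out{'P286dec.c'} } "/* decoder */\n";

close($_) or die "close failed: $!\n" for values %out;
```

<P>
Keeping the handles in a hash is only one possible design; three
separate variables would work just as well for a fixed set of files.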
<P>
After the files are opened, each source and header file needs
some preamble text. Opening a file for output with
<TT><FONT FACE="Courier">&gt;</FONT></TT> destroys its previous
contents, so each file has to be initialized from scratch. The
following calls provide the initialization:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">&startHeaderFile();<BR>
&startEncoderFile();<BR>
&startDecoderFile();</FONT></TT>
</BLOCKQUOTE>
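<P>
The bodies of these routines are not shown in this section. As a
rough guess at the shape of a header preamble and its matching
trailer, here is a sketch that writes an include guard. The names
<TT><FONT FACE="Courier">start_header_file</FONT></TT>,
<TT><FONT FACE="Courier">close_header_file</FONT></TT>, and the guard
symbol <TT><FONT FACE="Courier">P286_H</FONT></TT> are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical header preamble: a banner comment plus an include guard.
# The chapter's own startHeaderFile() may emit something different.
sub start_header_file {
    my ($fh, $guard) = @_;
    print {$fh} "/* Generated file. Do not edit by hand. */\n";
    print {$fh} "#ifndef $guard\n#define $guard\n\n";
}

# Hypothetical matching trailer that closes the include guard.
sub close_header_file {
    my ($fh, $guard) = @_;
    print {$fh} "\n#endif /* $guard */\n";
}

# Write into an in-memory buffer so the sketch needs no files on disk.
my $buffer = '';
open(my $hdrs, '>', \$buffer) or die "Cannot open buffer: $!\n";
start_header_file($hdrs, 'P286_H');
close_header_file($hdrs, 'P286_H');
close($hdrs);

print $buffer;
```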
<P>
Then, a <TT><FONT FACE="Courier">while</FONT></TT> loop simply
reads in all the input, one line at a time.
<P>
After the terminating newline is removed, the incoming line
is examined to see whether it's a comment or a blank line. In
either case, the line is discarded. Look at the following code:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">while (<SAFE>) {<BR>
&nbsp;&nbsp;&nbsp;&nbsp;chomp($_);&nbsp;&nbsp;# removes only a trailing newline, unlike chop()<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if (/^#/) { next; }<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if (/^\s*$/) { next; }</FONT></TT>
</BLOCKQUOTE>
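<P>
The filtering step can be tried on its own. The sketch below reads
from an in-memory string instead of a file and uses
<TT><FONT FACE="Courier">chomp</FONT></TT>, which removes only a
trailing newline. The function name
<TT><FONT FACE="Courier">significant_lines</FONT></TT> and the sample
text are made up for illustration.

```perl
use strict;
use warnings;

# Return only the lines that carry a definition: comment lines and
# blank lines are dropped, and the trailing newline is removed.
sub significant_lines {
    my ($text) = @_;
    my @kept;
    open(my $in, '<', \$text) or die "Cannot read buffer: $!\n";
    while (my $line = <$in>) {
        chomp $line;
        next if $line =~ /^#/;       # comment line
        next if $line =~ /^\s*$/;    # blank or whitespace-only line
        push @kept, $line;
    }
    close($in);
    return @kept;
}

# The last line has no newline; chomp leaves it intact.
my $sample = "# a comment\n\nRECORD E3100\n   \nENDREC";
my @lines  = significant_lines($sample);
print join('|', @lines), "\n";    # prints RECORD E3100|ENDREC
```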
<P>
If the line appears to be non-empty, it's split into the <TT><FONT FACE="Courier">@names</FONT></TT>
array and examined for the tokens <TT><FONT FACE="Courier">RECORD</FONT></TT>,
<TT><FONT FACE="Courier">ENDREC</FONT></TT>, and <TT><FONT FACE="Courier">REPEAT</FONT></TT>.
The default case is to process variable types and generate either
structure variable declarations or code for encoding and decoding
their values.
<P>
If it's a <TT><FONT FACE="Courier">RECORD</FONT></TT> token, a
new declaration is started for a structure. After the split,
<TT><FONT FACE="Courier">$rname</FONT></TT> holds the first word on
the line and <TT><FONT FACE="Courier">$rtype</FONT></TT> holds the
record type that follows it (for example,
<TT><FONT FACE="Courier">E3100</FONT></TT>). Note how <TT><FONT FACE="Courier">$_</FONT></TT>
is used twice: implicitly by the pattern match and explicitly as
the argument to <TT><FONT FACE="Courier">split()</FONT></TT>. Neither
use modifies <TT><FONT FACE="Courier">$_</FONT></TT>, and no function
is called that will modify it. The three functions,
<TT><FONT FACE="Courier">&startHeaderRecord($rtype)</FONT></TT>,
<TT><FONT FACE="Courier">&startEncoderFunction($rtype)</FONT></TT>,
and <TT><FONT FACE="Courier">&startDecoderFunction($rtype)</FONT></TT>,
take the P286 record type and generate a header declaration, an
encoder function preamble, and a decoder function preamble. We
also mark the fact that we are starting a new record by setting
the flag <TT><FONT FACE="Courier">$inRecord</FONT></TT> to <TT><FONT FACE="Courier">1</FONT></TT>.
The <TT><FONT FACE="Courier">$repeat</FONT></TT> flag is set to <TT><FONT FACE="Courier">0</FONT></TT>
so that no <TT><FONT FACE="Courier">REPEAT</FONT></TT> state carries
over from a previous record. Further processing of the incoming
line is then halted with the <TT><FONT FACE="Courier">next</FONT></TT>
statement. The fragment of code to start each record is shown here:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">if (/^RECORD/) {<BR>
($rname,$rtype,@rest) = split(' ',$_);
<BR>
&startHeaderRecord($rtype);<BR>
&startEncoderFunction($rtype);<BR>
&startDecoderFunction($rtype);<BR>
$inRecord = 1;<BR>
$repeat = 0;<BR>
next;<BR>
}</FONT></TT>
</BLOCKQUOTE>
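<P>
To see what <TT><FONT FACE="Courier">split(' ', ...)</FONT></TT>
returns for a <TT><FONT FACE="Courier">RECORD</FONT></TT> line, here
is a small standalone sketch. The function name
<TT><FONT FACE="Courier">parse_record_line</FONT></TT> is hypothetical;
the sample line uses the <TT><FONT FACE="Courier">E3100</FONT></TT>
record type from the input file.

```perl
use strict;
use warnings;

# Split a RECORD line into the keyword, the record type, and any extras.
# Returns an empty list when the line is not a RECORD line.
sub parse_record_line {
    my ($line) = @_;
    return unless $line =~ /^RECORD/;
    my ($keyword, $rtype, @rest) = split(' ', $line);
    return ($keyword, $rtype, @rest);
}

my $line = "RECORD   E3100";
my ($keyword, $rtype, @rest) = parse_record_line($line);

# split(' ') treats any run of white space as one separator.
print "keyword=$keyword type=$rtype extra=", scalar(@rest), "\n";

# Neither the match nor split() changed the original string.
print "line is unchanged: '$line'\n";
```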
<P>
The code in the <TT><FONT FACE="Courier">while</FONT></TT> loop also
has to detect the end of a record, which is marked by an <TT><FONT FACE="Courier">ENDREC</FONT></TT>
token. When <TT><FONT FACE="Courier">ENDREC</FONT></TT> is seen,
three functions are called to close the structure and function
declarations started by the <TT><FONT FACE="Courier">RECORD</FONT></TT>
block. Because we are no longer within a record, the value
of <TT><FONT FACE="Courier">$inRecord</FONT></TT> is set to <TT><FONT FACE="Courier">0</FONT></TT>,
and <TT><FONT FACE="Courier">next</FONT></TT> skips further processing
of this line. The fragment of code to do this cleanup is shown
here:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">if (/^ENDREC/) {<BR>
&stopHeaderRecord($rtype);<BR>
&closeEncoderFunction();<BR>
&closeDecoderFunction();<BR>
$inRecord = 0;<BR>
next;<BR>
}</FONT></TT>
</BLOCKQUOTE>
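<P>
The effect of tracking whether we are inside a record can be
illustrated without generating any C code. In the following sketch,
field names are collected per record and any line outside a
<TT><FONT FACE="Courier">RECORD</FONT></TT>/<TT><FONT FACE="Courier">ENDREC</FONT></TT>
pair is ignored. The function name
<TT><FONT FACE="Courier">group_fields_by_record</FONT></TT> and the
stray text lines are invented for the example.

```perl
use strict;
use warnings;

# Collect the field names that appear between RECORD and ENDREC lines.
sub group_fields_by_record {
    my (@lines) = @_;
    my %fields;
    my $current;                       # undef while outside any record
    for my $line (@lines) {
        if ($line =~ /^RECORD/) {
            (undef, $current) = split(' ', $line);
            $fields{$current} = [];
            next;
        }
        if ($line =~ /^ENDREC/) {
            undef $current;            # we have left the record
            next;
        }
        next unless defined $current;  # ignore text outside a record
        my ($vname) = split(' ', $line);
        push @{ $fields{$current} }, $vname;
    }
    return %fields;
}

my %fields = group_fields_by_record(
    'stray text',
    'RECORD E3100',
    'velprop double 6 12 7.2',
    'ENDREC',
    'more stray text',
);
print "E3100: @{ $fields{E3100} }\n";    # prints E3100: velprop
```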
<P>
The <TT><FONT FACE="Courier">REPEAT</FONT></TT> block is entered if
the word <TT><FONT FACE="Courier">REPEAT</FONT></TT> is the first
word on a line. Note that the <TT><FONT FACE="Courier">RECORD</FONT></TT>
and <TT><FONT FACE="Courier">ENDREC</FONT></TT> tokens must appear
at the very start of a line, whereas the <TT><FONT FACE="Courier">REPEAT</FONT></TT>
keyword may be preceded by white space. The
offsets are derived in two stages. The first stage calls
<TT><FONT FACE="Courier">split()</FONT></TT> to put the words on the
<TT><FONT FACE="Courier">REPEAT</FONT></TT> line into an array called
<TT><FONT FACE="Courier">@allOffsets</FONT></TT>: item 0 is the
<TT><FONT FACE="Courier">REPEAT</FONT></TT> keyword itself (saved in
<TT><FONT FACE="Courier">$index</FONT></TT>), and item 1 is the number
of repetitions (saved in <TT><FONT FACE="Courier">$count</FONT></TT>).
The second stage calls the <TT><FONT FACE="Courier">splice()</FONT></TT>
function to extract the items starting from item number
2 in <TT><FONT FACE="Courier">@allOffsets</FONT></TT>. The <TT><FONT FACE="Courier">@offsets</FONT></TT>
array then holds the offsets in the record at which the variables
that follow are repeated in blocks. Finally, <TT><FONT FACE="Courier">next</FONT></TT>
proceeds to the next line of the input file.
<P>
The fragment of code to do this is shown here:
<BLOCKQUOTE>
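<TT><FONT FACE="Courier">if (/^\s*REPEAT/) {<BR>
&nbsp;&nbsp;&nbsp;&nbsp;$repeat = 1;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;@allOffsets = split(' ',$_);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;$index = $allOffsets[0];<BR>
&nbsp;&nbsp;&nbsp;&nbsp;$count = $allOffsets[1];<BR>
&nbsp;&nbsp;&nbsp;&nbsp;print "INDEX = $index, COUNT= $count\n";<BR>
&nbsp;&nbsp;&nbsp;&nbsp;@offsets = splice(@allOffsets,2);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;next;<BR>
}</FONT></TT>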
</BLOCKQUOTE>
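<P>
Here is a standalone sketch of the two stages, run against the
<TT><FONT FACE="Courier">REPEAT</FONT></TT> line from the
<TT><FONT FACE="Courier">E3100</FONT></TT> record. The function name
<TT><FONT FACE="Courier">parse_repeat_line</FONT></TT> is an assumption
for illustration. Note that <TT><FONT FACE="Courier">splice()</FONT></TT>
removes the items it returns, so only two items remain in the
original array afterward.

```perl
use strict;
use warnings;

# Parse a REPEAT line into its keyword, repetition count, and offsets.
sub parse_repeat_line {
    my ($line) = @_;
    return unless $line =~ /^\s*REPEAT/;
    my @fields  = split(' ', $line);    # leading white space is skipped
    my $keyword = $fields[0];
    my $count   = $fields[1];
    my @offsets = splice(@fields, 2);   # removes and returns items 2..end
    return ($keyword, $count, \@offsets, scalar(@fields));
}

my ($keyword, $count, $offsets, $left)
    = parse_repeat_line("  REPEAT 5 13 26 39 52 65");

# prints count=5 offsets=13 26 39 52 65 remaining=2
print "count=$count offsets=@$offsets remaining=$left\n";
```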
<P>
Finally, the default processing begins for a line. The first thing
to do before attempting to parse a line is to see whether we are
in the middle of a record. Because the input file may also contain
free-form text in the future, this check is a bit of insurance
against accidentally declaring variables.
<P>
The incoming line is parsed to extract five values into five scalar
variables. The input string <TT><FONT FACE="Courier">$_</FONT></TT> is split
on white space. The call for this is
<BLOCKQUOTE>
<TT><FONT FACE="Courier">(<I>$vname,$vtype,$from,$to,$fmt</I>)
= split(' ',$_);</FONT></TT>
</BLOCKQUOTE>
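<P>
As an illustration, the sketch below parses two field lines from the
input file. It assumes that the <TT><FONT FACE="Courier">from</FONT></TT>
and <TT><FONT FACE="Courier">to</FONT></TT> columns are inclusive, which
matches the sample data (columns 42 to 53 with a
<TT><FONT FACE="Courier">12.3</FONT></TT> format). The function name
<TT><FONT FACE="Courier">parse_field_line</FONT></TT> and the
<TT><FONT FACE="Courier">width</FONT></TT> key are hypothetical. When a
line has no format column, as with the
<TT><FONT FACE="Courier">int</FONT></TT> fields, the fifth value is
simply undefined.

```perl
use strict;
use warnings;

# Parse one field definition. The fmt entry is undef when the line
# has only four columns. Width assumes inclusive from/to columns.
sub parse_field_line {
    my ($line) = @_;
    my ($vname, $vtype, $from, $to, $fmt) = split(' ', $line);
    my %field = (
        name  => $vname,
        type  => $vtype,
        from  => $from,
        to    => $to,
        fmt   => $fmt,
        width => $to - $from + 1,
    );
    return \%field;
}

for my $line ("semimajorAxis double 42 53 12.3", "srcNdx int 13 15") {
    my $f   = parse_field_line($line);
    my $fmt = defined $f->{fmt} ? $f->{fmt} : '(none)';
    print "$f->{name}: $f->{type}, columns $f->{from}-$f->{to}, ",
          "width $f->{width}, format $fmt\n";
}
```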
<P>
There are two ways to process a variable in our case. One is when
a variable is by itself and another is when the variable is in
a block being repeated. If a block is being repeated, it's easier
to simply declare an array and parse into it. Here's the code
to handle this part:
<BLOCKQUOTE>
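<TT><FONT FACE="Courier">if ($inRecord)<BR>
{<BR>
&nbsp;&nbsp;&nbsp;&nbsp;($vname,$vtype,$from,$to,$fmt) = split(' ',$_);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;if ($repeat == 0)<BR>
&nbsp;&nbsp;&nbsp;&nbsp;{<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&makeHeaderItem($vname,$vtype,$from,$to);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&encodeVariable($vname,$vtype,$from,$to,$fmt);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&decodeVariable($vname,$vtype,$from,$to,$fmt);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;}<BR>
&nbsp;&nbsp;&nbsp;&nbsp;else<BR>
&nbsp;&nbsp;&nbsp;&nbsp;{<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&makeArrayedItem($vname,$vtype,$count,$from,$to);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetLen = $to - $from + 1;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetCount = 0;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;foreach $x (@offsets) {<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# $from and $to describe the first block, which starts at<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# $offsets[0]; shift them to the block starting at $x.<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Both ends of the range are inclusive.<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetFrom = $x + $from - $offsets[0];<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetTo = $offsetFrom + $offsetLen - 1;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetName = sprintf "%s[%d]", $vname,$offsetCount;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print "Name = $offsetName, COUNT= $count\n";<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&encodeVariable($offsetName,$vtype,<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetFrom,$offsetTo,$fmt);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&decodeVariable($offsetName,$vtype,<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetFrom,$offsetTo,$fmt);<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$offsetCount++;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<BR>
&nbsp;&nbsp;&nbsp;&nbsp;} ## end of else clause.<BR>
}<BR>
} # of while loop.</FONT></TT>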
</BLOCKQUOTE>
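<P>
The column arithmetic for a repeated block can be checked in
isolation. The following sketch assumes that a field's
<TT><FONT FACE="Courier">from</FONT></TT> and
<TT><FONT FACE="Courier">to</FONT></TT> columns describe its position
in the first block, that the first offset on the
<TT><FONT FACE="Courier">REPEAT</FONT></TT> line is the start of that
first block, and that column ranges are inclusive. The function name
<TT><FONT FACE="Courier">expand_repeated_field</FONT></TT> is
hypothetical.

```perl
use strict;
use warnings;

# Expand one field of a repeated block into per-element column ranges.
# Each result is [ element name, first column, last column ].
sub expand_repeated_field {
    my ($vname, $from, $to, $offsets) = @_;
    my $base = $offsets->[0];           # start of the first block
    my @items;
    my $i = 0;
    for my $start (@$offsets) {
        my $shift = $start - $base;     # distance from the first block
        push @items,
            [ sprintf("%s[%d]", $vname, $i++), $from + $shift, $to + $shift ];
    }
    return @items;
}

# dstNdx occupies columns 16-18 in the first block of the E3100 record.
my @offsets = (13, 26, 39, 52, 65);
my @items   = expand_repeated_field('dstNdx', 16, 18, \@offsets);
for my $item (@items) {
    printf "%s columns %d-%d\n", @$item;
}
```

<P>
Under these assumptions, <TT><FONT FACE="Courier">dstNdx[1]</FONT></TT>
lands on columns 29 to 31 and <TT><FONT FACE="Courier">dstNdx[4]</FONT></TT>
on columns 68 to 70.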
<P>
The <TT><FONT FACE="Courier">while</FONT></TT> loop continues
to process each line in the input file until all the record definitions
have been completed. After the <TT><FONT FACE="Courier">while</FONT></TT>
loop ends, any terminal processing that must be done is completed
and all open files are closed:
<BLOCKQUOTE>
<TT><FONT FACE="Courier">close (SAFE);<BR>
&closeHeaderFile();<BR>
close (HDRS);<BR>
close (EncD);<BR>
close (DECD);</FONT></TT>
</BLOCKQUOTE>
<P>
When the program terminates, you should have three files in the
directory: <TT><FONT FACE="Courier">P286.h</FONT></TT>, <TT><FONT FACE="Courier">P286enc.c</FONT></TT>,
and <TT><FONT FACE="Courier">P286dec.c</FONT></TT>. The headers
are declared in the <TT><FONT FACE="Courier">P286.h</FONT></TT>
file, encoding functions are declared in the <TT><FONT FACE="Courier">P286enc.c</FONT></TT>