📄 rfc803.txt
字号:
RFC 803
Dacom 450/500 Facsimile Data Transcoding
A. Agarwal, M. J. O'Connor and D. L. Mills
2 November 1981
1. Introduction
As part of our effort in support of the DARPA Internet Program,
software modules to encode and decode facsimile data for the Dacom 450
and 500 models Computerfax facsimile machines have been constructed.
The purpose of these modules is to map the data representations used by
these machines to and from bit-map and run-length representations in
programs for editing, transmission and archiving facsimile images. The
modules are written in the PDP-11 MACRO-11 assembly language and can be
incorporated into programs for, among others, the RT-11 operating system
and the DCNET BOS or VOS operating systems.
The first part of this report describes in detail the Dacom 450
data compression algorithm and is an update and correction to an earlier
memorandum [2]. Following this, the encoding and decoding algorithms
are described along with the supporting programs and file formats.
Reference [3] describes another implementation of the decoding
algorithm. Grateful acknowledgment is made to E. A. Poe of Rapicom
for his assistance in this effort.
The second part of this report describes briefly the Dacom 500 data
compression algorithm as used by the INTELPOST electronic-mail network
under development by the US Postal Service and several foreign
administrations. These machines conform to the CCITT T.4 Draft
Recommendation, described in [5]. Supporting programs and file formats
are described.
2. Dacom 450 Data Compression Principles
The encoding algorithm for the Dacom 450 processes lines scanned by
the machine to produce a two-dimensional run-length code described by
Weber [1]; however, this article contains a number of errors and
omissions, many of which were discovered only after considerable
analysis and experimentation [2,3]. The machine operates over a
coordinate space of l726 by approximately 2200 pels when in
high-resolution (detail) mode. In normal (quality) mode the vertical
resolution is halved, so that about 1100 lines are transmitted, while in
express mode about 733 lines are transmitted (missed lines are filled in
on playback by replicating previous lines).
Data are encoded two rows at a time using a two-dimensional
run-length code. Each row-pair is scanned left-to-right and the
line-pairs themselves processed top-to-bottom of the document. Figure 1
shows how the pels are represented.
Dacom 450/500 Facsimile Data Transcoding PAGE 2
| | |
----+----------+----------+----
... | x(1,j) | x(1,j+1) | ...
----+----------+----------+----
... | x(2,j) | x(2,j+1) | ...
----+----------+----------+----
| | |
Direction of scan ->
Figure 1. Data Representation
For each j the vector (x(1,j),x(2,j)) represents the contents of
the jth column, where x(i,j) can take on values of zero (white) or one
(black). Each of the four possible vectors ranging over these values
will be called a state (Dacom calls these "modes") with the succession
of transitions between these states determined by the picture content of
the particular line-pair. Scanning of the line-pairs follows one after
the other with no special end-of-line code in the data itself. For the
purpose of later discussion and comparison with the published data, the
following conventions will be used (note: the pels read top-bottom):
Pels Vector State
---------------------
W-W (0,0) 0
B-W (1,0) 1
W-B (0,1) 2
B-B (1,1) 3
The algorithm used by Dacom to generate the transmitted data as the
columns are scanned can be described as the non-deterministic
finite-state automaton (nfsa) shown in Figure 2. Conceptually, the nfsa
starts at the beginning of a page in a designated state and at a point
just after scanning the jth column in the jth state. It then scans the
(j + 1)th column and enters that state while emitting the string of bits
shown in the figure.
In the states corresponding to W-W (0) and B-B (3) a special
run-length encoding techniques is used. There are two state variables
associated with each of these two states, one variable used as a
run-length counter and the other the field length (in bits) of this
counter. Upon each entry to either of these two states the counter is
initialized at zero and counts up for every additional column of the
same state. At the end of the run the value of this counter is
transmitted extending with high-order zeros, if necessary, to fill the
field length specified. If, however, the counter equals 2**n - 1, where
n is the field length, then a sequence of n one-bits is emitted and the
counter re-initialized at zero with a field length of n + 1. Thus, if
n = 3, a run length of three is transmitted as {010} and a run length of
seven as {110}, while a run length of eight as two words, {111} followed
by {0000}. The field-length variables are maintained separately for
both the W-W and B-B states, and at each re-entry to either of these
states the previous values are used.
Dacom 450/500 Facsimile Data Transcoding PAGE 3
0100
.--------------------->----------------------------------.
| |
| .-----------------<------------------------------. |
| | 1 | |
| V | |
.--------------. .---------------. | |
| | | | | |
| | 010 | | | |
.->| 1 |-------------------->| 2 |->. | |
| | | | | | | |
0| | B-W | 101 | W-B | |1 | |
\<-| |<--------------------| |<-' | |
| | | | | |
| | .---->| | | |
\--------------' | \---------------' | |
| A | | | A | |
| | .--------->------' | | | | |
| | | 1 | | | | |
| | | | | | A V
| | | | | | | |
0111| |1 | | 1000| |1 | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | 1011 | | | | |
| | | .-------<----------' | | | |
V | | | V | | |
.--------------. | .---------------. | |
| |<--' | | | |
| | 0 | | | |
| 3 |<--------------------| 0 |-----' |
| | | | |
| B-B | | W-W | |
| |-------------------->| |<--------'
| | 0 | |
| | | |
\--------------' \---------------'
| A | A
| | | |
\----' \----'
run run
Figure 2. NFSA Model of Encoding
Dacom 450/500 Facsimile Data Transcoding PAGE 4
Field-length values are constrained not to exceed seven, so that
runs exceeding l27 with n = 7 will be encoded as a separate 7-bit word
of one-bits for each run of l27 except the last, which must always
contain at least one zero-bit. The field length n is decreased by one
under the following circumstances: the current run has been encoded as a
single n-bit field, and for n in the range four through seven the two
high-order bits are zero or for n equal to three the single high-order
bit is zero. The field length is not allowed to be reduced below two
bits.
The encoding algorithm starts in state 0 with both field lengths
set to 7.
2.1. Dacom 450 Decoding Algorithm
For reasons of speed and simplicity it is desirable that the Dacom
450 decoding algorithm be modeled on the basis of a deterministic
finite-state automaton (dfsa). Using straightforward formal procedures,
the dfsa of Figure 3 can be constructed. This machine makes one state
transition for every bit, except for the W-W (0) and B-B (3) states,
which must be treated specially in any case. The states are labeled in
such a way as to correspond to those of Figure 2 for states numbered
from zero to three.
The decoded output symbols, in this case the columns corresponding
to each of the states, are represented by the states themselves. Upon
entry to state B-W (1) or W-B (2) a run-length counter is initialized to
one. Each traversal of a loop back to the same state increments this
counter and, upon exit to any other state, the value of this counter
represents the number of columns to be produced. Upon entry to state
W-W (0) or B-B (3) the run-length counter is initialized to zero and the
associated field-length state variable n established. For each
successive n bits of all-ones, the counter is increased by 2**n - 1 and
then n itself increased by one, but not above seven. If the next n bits
are not all ones, then the counter is increased by the value represented
by the n-bit field plus one. Finally, if upon entry to either state the
next n bits are not all ones, n is decreased by one according to the
rule mentioned in the preceding section.
Dacom 450/500 Facsimile Data Transcoding PAGE 5
.-----------. .-----------.
.-----| | | |-----.
| | 9 | | 6 | |
| .-| |<--. .-->| |-. |
| | \-----------' \ / \-----------' | |
1| 0| \ / |1 |0
| | .->Error \ / Error<-. | |
| | 0| \ / |1 | |
| | .-----------. \ / .-----------. | |
| 1 | | | \ / | | | 0 |
| .---| 7 | \ | 10 |---. |
| | | | | / \ | | | | |
| | | \-----------' / \ \-----------' | | |
| | | A / \ A | | |
| | | | / \ | | | |
| | | 1| / \ |0 | | |
| | | .-----------. 0 / \ 1 .-----------. | | |
| | | | |---' \---| | | | |
| | | | 5 | | 8 | | | |
| | | | | | | | | |
| | | \-----------' \-----------' | | |
| | | A A | | |
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -