\NeedsTeXFormat{LaTeX2e}
\documentclass{article}
\usepackage{psfig}
\usepackage{times}
\usepackage{a4wide}
\author{Frank Pilhofer}
\title{The UUDeview Decoding Library}
\providecommand{\ush}{\discretionary{-}{}{\_}}
\providecommand{\uuversion}{0.5}
\providecommand{\uupatch}{20}
%
% $Id: library.ltx,v 1.28 2004/03/01 23:06:20 fp Exp $
%
\begin{document}
\maketitle
\begin{abstract}
The UUDeview library is a highly portable set of functions that
provide facilities for decoding \emph{uuencoded}, \emph{xxencoded},
\emph{Base64} and \emph{BinHex}-encoded files as well as for
encoding binary files into all of these representations except
BinHex. This document describes how the features of encoding
and decoding can be integrated into your own applications.

The information is intended for developers only, and is not required
reading material for end users. It is assumed that the reader is
familiar with the general issue of encoding and decoding and has some
experience with the ``C'' programming language.

This document describes version \uuversion{}, patchlevel \uupatch{}
of the library.
\end{abstract}

\section{Introduction}

\subsection{Background}

The Internet provides us with a fast and reliable means of user-to-user
message delivery, using private email or newsgroups. Both systems were
originally designed to transport plain-text messages. Over the
years, some methods appeared allowing transport of arbitrary binary
data by ``encoding'' the data into plain-text messages. But even after
all these years, there are still certain problems handling the encoded
data, and many recipients have difficulties decoding the messages back
into their original form.

It should be the job of the mail delivery agent to handle sending and
receiving binary data transparently. However, the support of most
applications is limited, and several incompatibilities exist between
different programs.

There are three common formats for encoding binary data, called
\emph{uuencoding}, \emph{Base64} and \emph{BinHex}.
Issues are further
complicated by slight variations of the formats, the packaging, and
some broken implementations.

Further problems arise with multi-part postings, where the encoding
of a huge file has been split up into several individual messages to
ensure proper transfer over gateways with limited message sizes. Very
little software is able to properly sort and decode the parts. Even
nowadays, many users are at a loss to decode these kinds of messages.

This is where the UUDeview Decoding Library steps in.

\subsection{The Library}

The UUDeview library makes an attempt at decoding nearly all
kinds of encoded files. It is designed to decode multi-part files as
well as many files simultaneously. Part numbers are evaluated, thus
making it possible to rearrange parts that aren't in their correct
order.

No assumptions are made on the format of the input file. Usually the
input will be an email folder or newsgroup messages. If this is the
case, the information found in header lines is evaluated; but plain
encoded files with no surrounding information are also accepted. The
input may also consist of concatenated parts and files.

Decoding files is done in two passes. During the first pass, all input
files are scanned. Information is gathered about each chunk of encoded
data. Besides the obvious data about type, position and size of the
chunk, some environmental information from the envelope of a mail
message is also gathered if available.

If the scanner finds a properly MIME-formatted message, a proper MIME
parser steps into action. Because MIME messages include precise
information about the message's contents, there is seldom doubt about
its parts.

For other, non-MIME messages, the ``Subject'' header line is closely
examined. Two pieces of information are extracted: the part number
(usually given in parentheses) and a unique identifier, which is used
to group series of postings.
If the subject is, for example, ``uudeview.tgz
(01/04)'', the scanner concludes that this message is the first in a
series of four, and the indicated filename is an ideal key to identify
each of the four parts.

If the subject is incomplete (no part number) or missing, the scanner
tries to make the best of the available information, but some of the
advanced features won't work. For example, without any information
about the part number, it must be assumed that the available parts are
in correct order and can't be automatically rearranged.

All the information is gathered in a linked list. An application can
then examine the nodes of the list and pick individual items for
decoding. The decoding functions will then visit the parts of a file
in correct order and extract the binary data.

Because of heavy testing of the routines against real-life data
and many problem reports from users, the functions have become very
robust, even against input files with little, missing or broken
information.

\begin{figure}
\centering
\makebox{\input{structure.tex}}
\caption{Integration of the Library}
\label{structure}
\end{figure}

Figure~\ref{structure} displays how the library can be integrated into
an application. The library does not assume any capabilities of the
operating system or application language, and can thus be used in
almost any environment. The few necessary interfaces must be provided
by the application, which usually knows a great deal more about
the target system.

The idea of the ``language interface'' is to allow integration of the
library services into other programming languages; if the application
is itself written in C, there's no need for a separate interface, of
course.
Such an interface currently exists for the Tcl scripting
language; other examples might be Visual Basic, Perl or Delphi.

\subsection{Terminology}

These are some buzzwords that will be used in the following text.

\begin{itemize}
\item
``Encoded data'' is binary data encoded by one of the methods
``uuencoding'', ``xxencoding'', ``Base64'' or ``BinHex''.
\item
``Message'' refers to both complete email messages and Usenet news
postings, including the complete headers. The format of a message is
described in \cite{rfc0822}. A ``message body'' is an email message
or news posting without headers.
\item
A ``mail folder'' is a number of concatenated messages.
\item
``MIME'' refers to the standards set in \cite{rfc1521}.
\item
A ``multipart message'' is an entity described by the MIME
standard. It is a single message divided into one or more individual
parts by a unique boundary.
\item
A ``partial message'' is also described by the MIME standard. It is a
message with an associated identifier and a part number. Large
messages can be split into multiple partial messages on the sender's
side. The recipient's software groups the partial messages by their
identifier and composes them back into the original large message.
\item
The term ``partial message'' only refers to \emph{one part} of the
large message. The original, partialized message is referred to as a
``multi-part message'' (note the hyphen). To clarify, one part of a
multi-part message is a partial message.
\end{itemize}

\section{Compiling the Library}

On Unix systems, configuration and compilation are trivial. The
script \texttt{configure} automatically checks your
system and configures the library appropriately. A subsequent
``make'' compiles the modules and builds the final library.

On other systems, you must manually create the configuration file and
the Makefile.
The configuration file \texttt{config.h} contains a set
of preprocessor definitions and macros that describe the available
features on your system.

\subsection{Creating \texttt{config.h} by hand}

You can find all available definitions in \texttt{config.h.in}. This
file undefines all possible definitions; you can create your own
configuration file by starting from \texttt{config.h.in} and making the
necessary changes.

Most definitions are either present or absent; only a few need to have
a value. If not explicitly mentioned, you can activate a definition
by changing the default \texttt{undef} into \texttt{define}.

The following definitions are available:

\subsubsection{System Specific}

\begin{description}
\item[\texttt{SYSTEM\_DOS}]
Define for compilation on a \emph{DOS} system. Currently unused.
\item[\texttt{SYSTEM\_QUICKWIN}]
Define for compilation within a \emph{QuickWin}\footnote{The
Microsoft compilers offer the \emph{QuickWin} target to allow
terminal-oriented programs to run in the Windows environment.}
program. Currently unused.
\item[\texttt{SYSTEM\_WINDLL}]
Causes all modules to include \texttt{<windows.h>} before any other
include file. Makes \texttt{uulib.c} export a
\texttt{Dll\-Entry\-Point} function.
\item[\texttt{SYSTEM\_OS2}]
Causes all modules to include \texttt{<os2.h>} before any other
include file.
\end{description}

\subsubsection{Compiler Specific}

\begin{description}
\item[\texttt{PROTOTYPES}]
Define if your compiler supports function prototypes.
\item[\texttt{UUEXPORT}]
This can be a declaration applied to all functions exported from the
decoding library. Frequently needed when compiling into a shared library.
\item[\texttt{TOOLEXPORT}]
Similar to \texttt{UUEXPORT}, but for the helper and replacement
functions in \texttt{fptools.c}.
\end{description}

\subsubsection{Header Files}

There are a number of options that define whether header files are
available on your system. Don't worry if some of them are not.
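
For the system- and compiler-specific definitions described above, the
beginning of a hand-edited \texttt{config.h} might look like the
following sketch. It uses only the names introduced so far; the choice
of which ones to define, and the empty values given to
\texttt{UUEXPORT} and \texttt{TOOLEXPORT}, are assumptions for a
hypothetical static build with an ANSI C compiler, not recommended
settings.

\begin{verbatim}
/* config.h: hand-edited copy of config.h.in (illustrative sketch) */

/* System specific: assumed not to apply to this hypothetical target */
#undef SYSTEM_DOS
#undef SYSTEM_QUICKWIN
#undef SYSTEM_WINDLL
#undef SYSTEM_OS2

/* Compiler specific: the compiler supports function prototypes */
#define PROTOTYPES

/* Assumption: a static library needs no export declaration,
 * so both macros expand to nothing */
#define UUEXPORT
#define TOOLEXPORT
\end{verbatim}
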
If a
header file is present, define ``\texttt{HAVE\_}\emph{name-of-header}'':
\texttt{HAVE\ush{}ERRNO\_H},
\texttt{HAVE\ush{}FCNTL\_H},
\texttt{HAVE\ush{}IO\_H},
\texttt{HAVE\ush{}MALLOC\_H},
\texttt{HAVE\ush{}MEMORY\_H},
\texttt{HAVE\ush{}UNISTD\_H} and
\texttt{HAVE\ush{}SYS\_TIME\_H}
(for \texttt{<sys/time.h>}). Some other include files are needed
as well, but there are no macros for mandatory include files.

There are also a number of header-specific definitions that do not fit
into the general present-or-not-present scheme.

\begin{description}
\item[\texttt{STDC\_HEADERS}]
Define if your header files conform to \emph{ANSI C}. This requires
that \texttt{stdarg.h} is present, that \texttt{stdlib.h} is
available, declaring both \texttt{malloc()} and \texttt{free()}, and
that \texttt{string.h} declares the memory functions family
(\texttt{memcpy()} etc.).
\item[\texttt{HAVE\_STDARG\_H}]
Implicitly set by \texttt{STDC\ush{}HEADERS}. You only need to define
this one if \texttt{STDC\ush{}HEADERS} is not defined but
\texttt{<stdarg.h>} is available.
\item[\texttt{HAVE\_VARARGS\_H}]
\emph{varargs} can be used as an alternative to \emph{stdarg}. Define
if the above two values are undefined and \texttt{<varargs.h>} is
available.
\item[\texttt{TIME\_WITH\_SYS\_TIME}]
Define if \texttt{HAVE\ush{}SYS\ush{}TIME\_H} is defined and both
\texttt{<sys/time.h>} and \texttt{<time.h>} can be included without
conflicting definitions.
\end{description}

\subsubsection{Functions}

\begin{description}
\item[\texttt{HAVE\_STDIO}]
Define if standard I/O (\texttt{stdin}, \texttt{stdout} and
\texttt{stderr}) is available.
\item[\texttt{HAVE\_GETTIMEOFDAY}]
Define if your system provides the \texttt{gettimeofday()} system
call, which is needed to provide microsecond resolution to the
busy callback. If this function is not available, \texttt{time()} is
used.
\end{description}

\subsubsection{Replacement Functions}

The tools library \texttt{fptools} defines many functions that aren't
standard on all systems.
Most of them do not differ in behavior from
their originals, but might be slightly slower. But since they are
usually only needed in non-speed-critical sections, the replacements
