ARC-FILE.INF, created by Keith Petersen, W8SDZ, 21-Sep-86, extracted
from UNARC.INF by Robert A. Freed.

From:    Robert A. Freed
Subject: Technical Information for ARC files
Date:    June 24, 1986

Note: In the following discussion, UNARC refers to my CP/M-80 program
for extracting files from MSDOS ARCs. The definitions of the ARC file
format are based on MSDOS ARC512.EXE.

ARCHIVE FILE FORMAT
-------------------

Component files are stored sequentially within an archive. Each entry
is preceded by a 29-byte header, which contains the directory
information. There is no wasted space between entries. (This is in
contrast to the centralized directory used by Novosielski libraries.
Although random access to subfiles within an archive can be noticeably
slower than with libraries, archives do have the advantage of not
requiring pre-allocation of directory space.)

Archive entries are normally maintained in sorted name order. The
format of the 29-byte archive header is as follows:

Byte 1:  1A Hex.
         This marks the start of an archive header. If this byte is
         not found when expected, UNARC will scan forward in the file
         (up to 64K bytes) in an attempt to find it (followed by a
         valid compression version). If a valid header is found in
         this manner, a warning message is issued and archive file
         processing continues. Otherwise, the file is assumed to be
         an invalid archive and processing is aborted. (This is
         compatible with MS-DOS ARC version 5.12).
         Note that a special exception is made at the beginning of an
         archive file, to accommodate "self-unpacking" archives (see
         below).

Byte 2:  Compression version, as follows:
            0 = end of file marker (remaining bytes not present)
            1 = unpacked (obsolete)
            2 = unpacked
            3 = packed
            4 = squeezed (after packing)
            5 = crunched (obsolete)
            6 = crunched (after packing) (obsolete)
            7 = crunched (after packing, using faster hash algorithm)
                (obsolete)
            8 = crunched (after packing, using dynamic LZW variations)

Bytes 3-15:  ASCII file name, nul-terminated.

(All of the following numeric values are stored low-byte first.)

Bytes 16-19: Compressed file size in bytes.

Bytes 20-21: File date, in 16-bit MS-DOS format:
                Bits 15:9 = year - 1980
                Bits  8:5 = month of year
                Bits  4:0 = day of month
             (All zero means no date.)

Bytes 22-23: File time, in 16-bit MS-DOS format:
                Bits 15:11 = hour (24-hour clock)
                Bits 10:5  = minute
                Bits  4:0  = second/2 (not displayed by UNARC)

Bytes 24-25: Cyclic redundancy check (CRC) value (see below).

Bytes 26-29: Original (uncompressed) file length in bytes.
             (This field is not present for version 1 entries, byte
             2 = 1. I.e., in this case the header is only 25 bytes
             long. Because version 1 files are uncompressed, the
             value normally found in this field may be obtained from
             bytes 16-19.)

SELF-UNPACKING ARCHIVES
-----------------------

A "self-unpacking" archive is one which can be renamed to a .COM file
and executed as a program. An example of such a file is the MS-DOS
program ARC512.COM, which is a standard archive file preceded by a
three-byte jump instruction. The first entry in this file is a simple
"bootstrap" program in uncompressed form, which loads the subfile
ARC.EXE (also uncompressed) into memory and passes control to it.
In anticipation of a similar scheme for future distribution of UNARC,
the program permits up to three bytes to precede the first header in
an archive file (with no error message).

CRC COMPUTATION
---------------

Archive files use a 16-bit cyclic redundancy check (CRC) for error
control. The particular CRC polynomial used is x^16 + x^15 + x^2 + 1,
which is commonly known as "CRC-16" and is used in many data
transmission protocols (e.g. DEC DDCMP and IBM BSC), as well as by
most floppy disk controllers. Note that this differs from the CCITT
polynomial (x^16 + x^12 + x^5 + 1), which is used by the XMODEM-CRC
protocol and the public domain CHEK program (although these do not
adhere strictly to the CCITT standard). The MS-DOS ARC program does
perform a mathematically sound and accurate CRC calculation. (We
mention this because it contrasts with some unfortunately popular
public domain programs we have witnessed, which from time immemorial
have based their calculation on an obscure magazine article which
contained a typographical error!)

Additional note (while we are on the subject of CRC's): The validity
of using a 16-bit CRC for checking an entire file is somewhat
questionable. Many people quote the statistics related to these
functions (e.g. "all two-bit errors, all single burst errors of 16 or
fewer bits, 99.997% of all single 17-bit burst errors, etc."), without
realizing that these claims are valid only if the total number of bits
checked is less than 32767 (which is why they are used in small-packet
data transmission protocols). I.e., for file sizes in excess of about
4K bytes, a 16-bit CRC is not really as good as what is often claimed.
This is not to say that it is bad, but there are more reliable methods
available (e.g. the 32-bit AUTODIN-II polynomial). (End of lecture!)

                                Bob Freed
                                62 Miller Road
                                Newton Centre, MA  02159
                                Telephone (617) 332-3533
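
Illustrative sketch (not part of the original note): the header layout
given under ARCHIVE FILE FORMAT could be decoded along the following
lines in Python. All names here (ArcHeader, parse_header, decode_date,
decode_time) are hypothetical and chosen only for this example. The
sketch assumes the caller already knows where the header starts; it
does not attempt the forward scan or the three-byte allowance for
self-unpacking archives that UNARC performs.

```python
import struct
from typing import NamedTuple, Optional, Tuple

HEADER_MARKER = 0x1A
# Bytes 3-29 of a full header: 13-byte name, then little-endian fields.
FULL_FIELDS = struct.Struct("<13sIHHHI")
# Version 1 entries omit bytes 26-29 (the original length).
SHORT_FIELDS = struct.Struct("<13sIHHH")


class ArcHeader(NamedTuple):
    version: int
    name: str
    compressed_size: int
    date: Optional[Tuple[int, int, int]]  # (year, month, day) or None
    time: Tuple[int, int, int]            # (hour, minute, second)
    crc: int
    original_size: int
    header_length: int                    # 29, or 25 for version 1


def decode_date(value: int) -> Optional[Tuple[int, int, int]]:
    """Split a 16-bit MS-DOS date; all zero means no date."""
    if value == 0:
        return None
    return (1980 + (value >> 9), (value >> 5) & 0x0F, value & 0x1F)


def decode_time(value: int) -> Tuple[int, int, int]:
    """Split a 16-bit MS-DOS time; seconds are stored divided by 2."""
    return (value >> 11, (value >> 5) & 0x3F, (value & 0x1F) * 2)


def parse_header(buf: bytes, offset: int = 0) -> Optional[ArcHeader]:
    """Decode one header; return None at the end-of-file marker."""
    if buf[offset] != HEADER_MARKER:
        raise ValueError("expected 1A hex at start of archive header")
    version = buf[offset + 1]
    if version == 0:
        return None  # end of archive, remaining bytes not present
    if version == 1:
        raw_name, csize, date, time, crc = SHORT_FIELDS.unpack_from(
            buf, offset + 2)
        osize = csize  # version 1 is uncompressed, so sizes match
        length = 2 + SHORT_FIELDS.size
    else:
        raw_name, csize, date, time, crc, osize = FULL_FIELDS.unpack_from(
            buf, offset + 2)
        length = 2 + FULL_FIELDS.size
    name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
    return ArcHeader(version, name, csize, decode_date(date),
                     decode_time(time), crc, osize, length)
```

Because there is no wasted space between entries, a reader would
presumably find the next header at offset + header_length +
compressed_size, and would stop when parse_header returns None.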
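
Illustrative sketch (not part of the original note): a bit-at-a-time
Python version of the CRC-16 polynomial x^16 + x^15 + x^2 + 1 named
under CRC COMPUTATION. The note gives only the polynomial. The bit
order (least significant bit first, hence the reflected constant A001
hex), the zero initial value, and the absence of a final XOR are
assumptions here; they match the variant usually catalogued as
"CRC-16/ARC". The function name crc16_arc is hypothetical.

```python
def crc16_arc(data: bytes, crc: int = 0) -> int:
    """CRC-16 over data, polynomial x^16 + x^15 + x^2 + 1.

    Assumes LSB-first processing (0xA001 is 0x8005 bit-reversed),
    initial value 0 and no final XOR. Pass a previous result as
    crc to continue a running checksum across several chunks.
    """
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc
```

A table-driven version would be faster for large files; the loop form
is used here only because it shows the polynomial directly. The
result would be compared against the 16-bit value stored in bytes
24-25 of the entry header.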