      LINK     soft link (Unix)
      MAC      Macintosh file
      SAM      sequential access method (Primos)
      SEGSAM   segmented sequential access method (Primos)
      SEGDAM   segmented direct access method (Primos)
      TEXT     lines of ISO-10646-UTF-1 text ending with CR/LF
      VAR      variable length records (VMS)

4.2.4  Created

   Indicates the creation date of the file.  Dates are in the format
   defined in section 4.3.

4.2.5  Modified

   Indicates the date and time the file was last modified or closed
   after being open for write.

4.2.6  Accessed

   Indicates the date and time the file was last accessed on the
   original file system.

4.2.7  Owner

   The owner directive gives the name or numerical ID of the owner or
   creator of the file.

Costanzo, Robinson & Ullmann                                   [Page 15]

RFC 1505                 Encoding Header Field               August 1993

4.2.8  Group

   The group directive gives the name(s) or numerical IDs of the group
   or groups to which the file belongs.

4.2.9  ACL

   This directive specifies the access control list attribute of an
   object (the ACL attribute may occur more than once within an
   object).  The list consists of a series of pairs of IDs and access
   codes in the format:

      user-ID:access-list

   There are four reserved IDs:

      $OWNER    the owner or creator
      $GROUP    a member of the group or groups
      $SYSTEM   a system administrator
      $REST     everyone else

   The access list is zero or more single letters:

      A   add (create file)
      D   delete
      L   list (read directory)
      P   change protection
      R   read
      U   use
      W   write
      X   execute
      *   all possible access

4.2.10  Password

   The password attribute gives the access password for this object.
   Since the content of the object follows (being the raison d'etre of
   the encoding), the appearance of the password in plain text is not
   considered a security problem.
   If the password is actually set by the decoder on a created object,
   the security (or lack) is the responsibility of the application
   domain controlling the decoder, as is true of ACL and other
   protections.

4.2.11  Block

   The block attribute gives the block size of the file as a decimal
   number of bytes.

4.2.12  Record

   The record attribute gives the record size of the file as a decimal
   number of bytes.

4.2.13  Application

   This specifies the application that the file was created with or
   belongs to.  This is of particular interest for Macintosh files.

4.3  Date Field

   Various attributes are followed by an associated date and time.

4.3.1  Syntax

   The syntax of the date field is a combination of date, time, and
   timezone:

      DD Mon YYYY HH:MM:SS.FFFFFF [+-]HHMMSS

      Date := DD Mon YYYY     1 or 2 Digits " " 3 Alpha " " 4 Digits
        DD := Day             e.g. "08", " 8", "8"
       Mon := Month           "Jan" | "Feb" | "Mar" | "Apr" |
                              "May" | "Jun" | "Jul" | "Aug" |
                              "Sep" | "Oct" | "Nov" | "Dec"
      YYYY := Year
      Time := HH:MM:SS.FFFFFF 2 Digits ":" 2 Digits [ ":" 2 Digits
                              ["." 1 to 6 Digits ] ]
                              e.g. 00:00:00, 23:59:59.999999
        HH := Hours           00 to 23
        MM := Minutes         00 to 59
        SS := Seconds         00 to 60 (60 only during a leap second)
    FFFFFF := Fraction
      Zone := [+-]HHMMSS      "+" | "-" 2 Digits [ 2 Digits
                              [ 2 Digits ] ]
        HH := Local Hour Offset
        MM := Local Minutes Offset
        SS := Local Seconds Offset

4.3.2  Semantics

   The date information is that which the file system has stored in
   regard to the file system object.  Date information is stored
   differently and with varying degrees of precision by different
   computer file systems.  An encoder must include as much date
   information as it has available concerning the file system object.
   A decoder which receives an object encoded with a date field
   containing greater precision than its own must disregard the
   excessive information.
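   The syntax in section 4.3.1 could be parsed along the following
   lines.  This is only a minimal sketch, not part of the
   specification: the names date_field, read_digits and
   parse_date_field are hypothetical, the fraction is scaled to
   microseconds as an assumption, and calendar validity (for example,
   day 31 in a 30-day month) is not checked.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for a parsed date field (not from the RFC). */
struct date_field {
    int day, month, year;      /* month is 1..12 */
    int hour, minute, second;  /* second may be 60 for a leap second */
    long micros;               /* fraction, scaled to microseconds */
    long zone_seconds;         /* signed offset from UTC in seconds */
    int has_zone;              /* nonzero if a Zone was present */
};

static const char *months = "JanFebMarAprMayJunJulAugSepOctNovDec";

/* Read exactly n decimal digits at *p; on success store the value in
   *out and advance *p.  On failure nothing is modified. */
static int read_digits(const char **p, int n, int *out)
{
    int v = 0;
    for (int i = 0; i < n; i++) {
        if ((*p)[i] < '0' || (*p)[i] > '9') return 0;
        v = v * 10 + ((*p)[i] - '0');
    }
    *p += n;
    *out = v;
    return 1;
}

/* Returns 1 if s matches the section 4.3.1 syntax, 0 otherwise. */
int parse_date_field(const char *s, struct date_field *d)
{
    char mon[4];
    int used = 0;
    memset(d, 0, sizeof *d);

    /* Date: 1 or 2 digits, 3 letters, 4 digits. */
    if (sscanf(s, "%2d %3[A-Za-z] %4d%n",
               &d->day, mon, &d->year, &used) != 3)
        return 0;
    const char *m = strstr(months, mon);
    if (m == NULL || strlen(mon) != 3 || (m - months) % 3 != 0) return 0;
    d->month = (int)(m - months) / 3 + 1;

    /* Time: HH:MM, then optional :SS, then optional .F to .FFFFFF */
    const char *p = s + used;
    while (*p == ' ') p++;
    if (!read_digits(&p, 2, &d->hour) || *p++ != ':' ||
        !read_digits(&p, 2, &d->minute))
        return 0;
    if (*p == ':') {
        p++;
        if (!read_digits(&p, 2, &d->second)) return 0;
        if (*p == '.') {
            long scale = 100000;
            p++;
            if (*p < '0' || *p > '9') return 0;
            for (int i = 0; i < 6 && *p >= '0' && *p <= '9'; i++, p++) {
                d->micros += (*p - '0') * scale;
                scale /= 10;
            }
        }
    }
    if (d->hour > 23 || d->minute > 59 || d->second > 60) return 0;

    /* Zone: sign, then two, four or six digits. */
    while (*p == ' ') p++;
    if (*p == '+' || *p == '-') {
        int sign = (*p++ == '-') ? -1 : 1;
        int hh = 0, mm = 0, ss = 0;
        if (!read_digits(&p, 2, &hh)) return 0;
        if (read_digits(&p, 2, &mm)) read_digits(&p, 2, &ss);
        d->zone_seconds = sign * (hh * 3600L + mm * 60L + ss);
        d->has_zone = 1;
    }
    return *p == '\0';
}
```

   A decoder whose file system stores less precision could simply
   ignore the micros member, which matches the rule that excess
   precision is disregarded.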
   Zone is expressed relative to Co-ordinated Universal Time "UTC"
   (formerly called "Greenwich Mean Time").  The field specifies the
   time zone of the file system object as an offset from Universal
   Time.  It is expressed as a signed [+-] two, four or six digit
   number.

   A file that was created April 15, 1993 at 8:05 p.m. in Roselle
   Park, New Jersey, U.S.A. might have a date field which looks like:

      15 Apr 1993 20:05:22.12 -0500

5.  LZJU90: Compressed Encoding

   LZJU90 is an encoding for a binary or text object to be sent in an
   Internet mail message.  The encoding provides both compression and
   representation in a text format that will successfully survive
   transmission through the many different mailers and gateways that
   comprise the Internet and connected mail networks.

5.1  Overview

   The encoding first compresses the binary object, using a modified
   LZ77 algorithm, called LZJU90.  It then encodes each 6 bits of the
   output of the compression as a text character, using a character
   set chosen to survive any translations between codes, such as ASCII
   to EBCDIC.  The 64 six-bit strings 000000 through 111111 are
   represented by the characters "+", "-", "0" to "9", "A" to "Z", and
   "a" to "z".

   The output text begins with a line identifying the encoding.  This
   is for visual reference only; the "Encoding:" field in the header
   identifies the section to the user program.  It also names the
   object that was encoded, usually by a file name.

   The format of this line is:

      * LZJU90 <name>

   where <name> is optional.  For example:

      * LZJU90 vmunix

   This is followed by the compressed and encoded data, broken into
   lines where convenient.  It is recommended that lines be broken
   every 78 characters to survive mailers that incorrectly restrict
   line length.  The decoder must accept lines with 1 to 1000
   characters on each line.

   After this, there is one final line that gives the number of bytes
   in the original data and a CRC of the original data.
   This should match the byte count and CRC found during
   decompression.

   This line has the format:

      * <count> <CRC>

   where <count> is a decimal number, and CRC is 8 hexadecimal digits.
   For example:

      * 4128076 5AC2D50E

   The count used in the Encoding: field in the message header is the
   total number of lines, including the start and end lines that begin
   with *.

   A complete example is given in section 5.3.2.

5.2  Specification of the LZJU90 compression

   The Lempel-Ziv-Storer-Szymanski model of mixing pointers and
   literal characters is used in the compression algorithm.  Repeated
   occurrences of strings of octets are replaced by pointers to the
   earlier occurrence.

   The data compression is defined by the decoding algorithm.  Any
   encoder that emits symbols which cause the decoder to produce the
   original input is defined to be valid.

   There are many possible strategies for the maximal-string matching
   that the encoder does; section 5.3.1 gives the code for one such
   algorithm.  Regardless of which algorithm is used, and what
   tradeoffs are made between compression ratio and execution speed or
   space, the result can always be decoded by the simple decoder.

   The compressed data consists of a mixture of unencoded literal
   characters and copy pointers which point to an earlier occurrence
   of the string to be encoded.

   Compressed data contains two types of codewords:

   LITERAL   pass the literal directly to the uncompressed output.

   COPY      length, offset
             go back offset characters in the output and copy length
             characters forward to the current position.

   To distinguish between codewords, the copy length is used.  A copy
   length of zero indicates that the following codeword is a literal
   codeword.  A copy length greater than zero indicates that the
   following codeword is a copy codeword.
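   The effect of the two codeword types on the output can be sketched
   as follows.  This is an illustration under simplifying
   assumptions, not the decoder of section 5.3: it writes into one
   flat buffer that holds the whole output, and the names outbuf,
   emit_literal and emit_copy are hypothetical.  The point it shows is
   that a COPY may overlap the bytes it is producing (length greater
   than offset), so the copy has to proceed one byte at a time.

```c
#include <stddef.h>

/* Hypothetical flat output buffer holding everything decoded so far. */
struct outbuf {
    unsigned char *data;
    size_t len;   /* bytes produced so far */
    size_t cap;   /* total capacity of data */
};

/* LITERAL: pass one character straight to the output.
   Returns 1 on success, 0 if the buffer is full. */
int emit_literal(struct outbuf *o, unsigned char c)
{
    if (o->len >= o->cap) return 0;
    o->data[o->len++] = c;
    return 1;
}

/* COPY: go back `offset` characters in the output and copy `length`
   characters forward to the current position.  Copying byte by byte
   lets a short earlier string repeat itself when length > offset.
   Returns 1 on success, 0 if the offset does not refer to earlier
   output or the result would not fit. */
int emit_copy(struct outbuf *o, size_t length, size_t offset)
{
    if (offset == 0 || offset > o->len) return 0;
    if (length > o->cap - o->len) return 0;
    for (size_t i = 0; i < length; i++) {
        o->data[o->len] = o->data[o->len - offset];
        o->len++;
    }
    return 1;
}
```

   For instance, with "ab" already in the output, a COPY of length 4
   and offset 2 extends the output to "ababab".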
   To improve copy length encoding, a threshold value of 2 has been
   subtracted from the original copy length for copy codewords,
   because the minimum copy length is 3 in this compression scheme.

   The maximum offset value is set at 32255.  Larger offsets offer
   extremely low improvements in compression (less than 1 percent,
   typically).

   No special encoding is done on the LITERAL characters.  However,
   unary encoding is used for the copy length and copy offset values
   to improve compression.  A start-step-stop unary code is used.

   A (start, step, stop) unary code of the integers is defined as
   follows: The Nth codeword has N ones followed by a zero followed by
   a field of size START + (N * STEP).  If the field width is equal to
   STOP then the preceding zero can be omitted.  The integers are laid
   out sequentially through these codewords.  For example, (0, 1, 4)
   would look like:

      Codeword    Range

      0           0
      10x         1-2
      110xx       3-6
      1110xxx     7-14
      1111xxxx    15-30

   Following are the actual values used for copy length and copy
   offset:

   The copy length is encoded with a (0, 1, 7) code leading to a
   maximum copy length of 256 by including the THRESHOLD value of 2.

      Codeword          Range

      0                 0
      10x               3-4
      110xx             5-8
      1110xxx           9-16
      11110xxxx         17-32
      111110xxxxx       33-64
      1111110xxxxxx     65-128
      1111111xxxxxxx    129-256

   The copy offset is encoded with a (9, 1, 14) code leading to a
   maximum copy offset of 32255.  Offset 0 is reserved as an end of
   compressed data flag.

      Codeword               Range

      0xxxxxxxxx             0-511
      10xxxxxxxxxx           512-1535
      110xxxxxxxxxxx         1536-3583
      1110xxxxxxxxxxxx       3584-7679
      11110xxxxxxxxxxxxx     7680-15871
      11111xxxxxxxxxxxxxx    15872-32255

   The 0 has been chosen to signal the start of the field for ease of
   encoding.  (The bit generator can simply encode one more bit than
   is significant in the binary representation of the excess.)

   The stop values are useful in the encoding to prevent out of range
   values for the lengths and offsets, as well as shortening some
   codes by one bit.
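   Reading one integer in a (start, step, stop) code can be sketched
   as below.  This is a minimal illustration rather than the decoder
   of section 5.3: the bit source is assumed to be a C string of '0'
   and '1' characters so that the codewords in the tables above can be
   typed in directly, and the names bitsrc, get_bit and get_integer
   are hypothetical.

```c
/* Hypothetical bit source: a NUL-terminated string of '0'/'1'. */
struct bitsrc {
    const char *bits;
};

/* Return the next bit (0 or 1), or -1 when the source is exhausted. */
static int get_bit(struct bitsrc *b)
{
    if (*b->bits == '\0') return -1;
    return *b->bits++ - '0';
}

/* Decode one integer in a (start, step, stop) unary code.
   Returns the value, or -1 if the input ends inside a codeword. */
long get_integer(struct bitsrc *b, int start, int step, int stop)
{
    long base = 0;      /* first integer of the current codeword group */
    int width = start;  /* field width of the current codeword group */
    int bit;

    /* Each leading one skips a whole group of 2^width integers.  When
       width reaches stop, the terminating zero is omitted. */
    while (width < stop) {
        bit = get_bit(b);
        if (bit < 0) return -1;
        if (bit == 0) break;
        base += 1L << width;
        width += step;
    }

    /* The field selects one integer within the group. */
    long field = 0;
    for (int i = 0; i < width; i++) {
        bit = get_bit(b);
        if (bit < 0) return -1;
        field = (field << 1) | bit;
    }
    return base + field;
}
```

   With this sketch, a copy length would be read with
   get_integer(&b, 0, 1, 7), where 0 means a literal follows and any
   other value n stands for a copy of n + 2 characters, and a copy
   offset would be read with get_integer(&b, 9, 1, 14).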
   The worst case compression using this scheme is a 1/8 increase in
   size of the encoded data.  (One zero bit followed by 8 character
   bits).  After the character encoding, the worst case ratio is 3/2
   to the original data.

   The minimum copy length of 3 has been chosen because the worst case
   copy length and offset is 3 bits (3) and 19 bits (32255) for a
   total of 22 bits to encode a 3 character string (24 bits).

5.3  The Decoder

   As mentioned previously, the compression is defined by the decoder.
   Any encoder that produces output that is correctly decoded is by
   definition correct.

   The following is an implementation of the decoder, written more for
   clarity and as much portability as possible, rather than for
   maximum speed.

   When optimized for a specific environment, it will run
   significantly faster.

/* LZJU 90 Decoding program */

/* Written By Robert Jung and Robert Ullmann, 1990 and 1991. */

/* This code is NOT COPYRIGHT, not protected. It is in the true
   Public Domain. */

#include <stdio.h>
#include <string.h>

typedef unsigned char uchar;
typedef unsigned int  uint;

#define N          32255
#define THRESHOLD      3

#define STRTP          9
#define STEPP          1
#define STOPP         14

#define STRTL          0
#define STEPL          1
#define STOPL          7

static FILE *in;
static FILE *out;

static int   getbuf;
static int   getlen;
static long  in_count;
static long  out_count;
static long  crc;
static long  crctable[256];
static uchar xxcodes[] =
"+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ\
abcdefghijklmnopqrstuvwxyz";
static uchar ddcodes[256];
static uchar text[N];

#define CRCPOLY         0xEDB88320