⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 rfc3492.txt

📁 bind 9.3结合mysql数据库
💻 TXT
📖 第 1 页 / 共 5 页
字号:
      predict the minimum number of digits needed to represent the next      delta:         while delta > ((base - tmin) * tmax) div 2         do let delta = delta div (base - tmin)   4. The bias is set:         let bias =           (base * the number of divisions performed in step 3) +           (((base - tmin + 1) * delta) div (delta + skew))      The motivation for this procedure is that the current delta      provides a hint about the likely size of the next delta, and so      t(j) is set to tmax for the more significant digits starting with      the one expected to be last, tmin for the less significant digits      up through the one expected to be third-last, and somewhere      between tmin and tmax for the digit expected to be second-lastCostello                    Standards Track                     [Page 7]RFC 3492                     IDNA Punycode                    March 2003      (balancing the hope of the expected-last digit being unnecessary      against the danger of it being insufficient).4. Bootstring parameters   Given a set of basic code points, one needs to be designated as the   delimiter.  The base cannot be greater than the number of   distinguishable basic code points remaining.  The digit-values in the   range 0 through base-1 need to be associated with distinct non-   delimiter basic code points.  In some cases multiple code points need   to have the same digit-value; for example, uppercase and lowercase   versions of the same letter need to be equivalent if basic strings   are case-insensitive.   The initial value of n cannot be greater than the minimum non-basic   code point that could appear in extended strings.   The remaining five parameters (tmin, tmax, skew, damp, and the   initial value of bias) need to satisfy the following constraints:      0 <= tmin <= tmax <= base-1      skew >= 1      damp >= 2      initial_bias mod base <= base - tmin   Provided the constraints are satisfied, these five parameters affect   efficiency but not correctness.  They are best chosen empirically.   If support for mixed-case annotation is desired (see appendix A),   make sure that the code points corresponding to 0 through tmax-1 all   have both uppercase and lowercase forms.5. Parameter values for Punycode   Punycode uses the following Bootstring parameter values:      base         = 36      tmin         = 1      tmax         = 26      skew         = 38      damp         = 700      initial_bias = 72      initial_n    = 128 = 0x80   Although the only restriction Punycode imposes on the input integers   is that they be nonnegative, these parameters are especially designed   to work well with Unicode [UNICODE] code points, which are integers   in the range 0..10FFFF (but not D800..DFFF, which are reserved forCostello                    Standards Track                     [Page 8]RFC 3492                     IDNA Punycode                    March 2003   use by the UTF-16 encoding of Unicode).  The basic code points are   the ASCII [ASCII] code points (0..7F), of which U+002D (-) is the   delimiter, and some of the others have digit-values as follows:      code points    digit-values      ------------   ----------------------      41..5A (A-Z) =  0 to 25, respectively      61..7A (a-z) =  0 to 25, respectively      30..39 (0-9) = 26 to 35, respectively   Using hyphen-minus as the delimiter implies that the encoded string   can end with a hyphen-minus only if the Unicode string consists   entirely of basic code points, but IDNA forbids such strings from   being encoded.  The encoded string can begin with a hyphen-minus, but   IDNA prepends a prefix.  Therefore IDNA using Punycode conforms to   the RFC 952 rule that host name labels neither begin nor end with a   hyphen-minus [RFC952].   A decoder MUST recognize the letters in both uppercase and lowercase   forms (including mixtures of both forms).  An encoder SHOULD output   only uppercase forms or only lowercase forms, unless it uses mixed-   case annotation (see appendix A).   Presumably most users will not manually write or type encoded strings   (as opposed to cutting and pasting them), but those who do will need   to be alert to the potential visual ambiguity between the following   sets of characters:      G 6      I l 1      O 0      S 5      U V      Z 2   Such ambiguities are usually resolved by context, but in a Punycode   encoded string there is no context apparent to humans.6. Bootstring algorithms   Some parts of the pseudocode can be omitted if the parameters satisfy   certain conditions (for which Punycode qualifies).  These parts are   enclosed in {braces}, and notes immediately following the pseudocode   explain the conditions under which they can be omitted.Costello                    Standards Track                     [Page 9]RFC 3492                     IDNA Punycode                    March 2003   Formally, code points are integers, and hence the pseudocode assumes   that arithmetic operations can be performed directly on code points.   In some programming languages, explicit conversion between code   points and integers might be necessary.6.1 Bias adaptation function   function adapt(delta,numpoints,firsttime):     if firsttime then let delta = delta div damp     else let delta = delta div 2     let delta = delta + (delta div numpoints)     let k = 0     while delta > ((base - tmin) * tmax) div 2 do begin       let delta = delta div (base - tmin)       let k = k + base     end     return k + (((base - tmin + 1) * delta) div (delta + skew))   It does not matter whether the modifications to delta and k inside   adapt() affect variables of the same name inside the   encoding/decoding procedures, because after calling adapt() the   caller does not read those variables before overwriting them.Costello                    Standards Track                    [Page 10]RFC 3492                     IDNA Punycode                    March 20036.2 Decoding procedure   let n = initial_n   let i = 0   let bias = initial_bias   let output = an empty string indexed from 0   consume all code points before the last delimiter (if there is one)     and copy them to output, fail on any non-basic code point   if more than zero code points were consumed then consume one more     (which will be the last delimiter)   while the input is not exhausted do begin     let oldi = i     let w = 1     for k = base to infinity in steps of base do begin       consume a code point, or fail if there was none to consume       let digit = the code point's digit-value, fail if it has none       let i = i + digit * w, fail on overflow       let t = tmin if k <= bias {+ tmin}, or               tmax if k >= bias + tmax, or k - bias otherwise       if digit < t then break       let w = w * (base - t), fail on overflow     end     let bias = adapt(i - oldi, length(output) + 1, test oldi is 0?)     let n = n + i div (length(output) + 1), fail on overflow     let i = i mod (length(output) + 1)     {if n is a basic code point then fail}     insert n into output at position i     increment i   end   The full statement enclosed in braces (checking whether n is a basic   code point) can be omitted if initial_n exceeds all basic code points   (which is true for Punycode), because n is never less than initial_n.   In the assignment of t, where t is clamped to the range tmin through   tmax, "+ tmin" can always be omitted.  This makes the clamping   calculation incorrect when bias < k < bias + tmin, but that cannot   happen because of the way bias is computed and because of the   constraints on the parameters.   Because the decoder state can only advance monotonically, and there   is only one representation of any delta, there is therefore only one   encoded string that can represent a given sequence of integers.  The   only error conditions are invalid code points, unexpected end-of-   input, overflow, and basic code points encoded using deltas instead   of appearing literally.  If the decoder fails on these errors as   shown above, then it cannot produce the same output for two distinct   inputs.  Without this property it would have been necessary to re-Costello                    Standards Track                    [Page 11]RFC 3492                     IDNA Punycode                    March 2003   encode the output and verify that it matches the input in order to   guarantee the uniqueness of the encoding.6.3 Encoding procedure   let n = initial_n   let delta = 0   let bias = initial_bias   let h = b = the number of basic code points in the input   copy them to the output in order, followed by a delimiter if b > 0   {if the input contains a non-basic code point < n then fail}   while h < length(input) do begin     let m = the minimum {non-basic} code point >= n in the input     let delta = delta + (m - n) * (h + 1), fail on overflow     let n = m     for each code point c in the input (in order) do begin       if c < n {or c is basic} then increment delta, fail on overflow       if c == n then begin         let q = delta         for k = base to infinity in steps of base do begin           let t = tmin if k <= bias {+ tmin}, or                   tmax if k >= bias + tmax, or k - bias otherwise           if q < t then break           output the code point for digit t + ((q - t) mod (base - t))           let q = (q - t) div (base - t)         end         output the code point for digit q         let bias = adapt(delta, h + 1, test h equals b?)         let delta = 0         increment h       end     end     increment delta and n   end   The full statement enclosed in braces (checking whether the input   contains a non-basic code point less than n) can be omitted if all   code points less than initial_n are basic code points (which is true   for Punycode if code points are unsigned).   The brace-enclosed conditions "non-basic" and "or c is basic" can be   omitted if initial_n exceeds all basic code points (which is true for   Punycode), because the code point being tested is never less than   initial_n.   In the assignment of t, where t is clamped to the range tmin through   tmax, "+ tmin" can always be omitted.  This makes the clamping   calculation incorrect when bias < k < bias + tmin, but that cannotCostello                    Standards Track                    [Page 12]RFC 3492                     IDNA Punycode                    March 2003   happen because of the way bias is computed and because of the   constraints on the parameters.   The checks for overflow are necessary to avoid producing invalid   output when the input contains very large values or is very long.   The increment of delta at the bottom of the outer loop cannot   overflow because delta < length(input) before the increment, and   length(input) is already assumed to be representable.  The increment   of n could overflow, but only if h == length(input), in which case   the procedure is finished anyway.6.4 Overflow handling   For IDNA, 26-bit unsigned integers are sufficient to handle all valid   IDNA labels without overflow, because any string that needed a 27-bit   delta would have to exceed either the code point limit (0..10FFFF) or   the label length limit (63 characters).  However, overflow handling   is necessary because the inputs are not necessarily valid IDNA   labels.   If the programming language does not provide overflow detection, the   following technique can be used.  Suppose A, B, and C are   representable nonnegative integers and C is nonzero.  Then A + B   overflows if and only if B > maxint - A, and A + (B * C) overflows if   and only if B > (maxint - A) div C, where maxint is the greatest   integer for which maxint + 1 cannot be represented.  Refer to   appendix C "Punycode sample implementation" for demonstrations of   this technique in the C language.   The decoding and encoding algorithms shown in sections 6.2 and 6.3   handle overflow by detecting it whenever it happens.  Another   approach is to enforce limits on the inputs that prevent overflow   from happening.  For example, if the encoder were to verify that no   input code points exceed M and that the input length does not exceed   L, then no delta could ever exceed (M - initial_n) * (L + 1), and   hence no overflow could occur if integer variables were capable of   representing values that large.  This prevention approach would   impose more restrictions on the input than the detection approach   does, but might be considered simpler in some programming languages.   In theory, the decoder could use an analogous approach, limiting the   number of digits in a variable-length integer (that is, limiting the   number of iterations in the innermost loop).  However, the number of   digits that suffice to represent a given delta can sometimes   represent much larger deltas (because of the adaptation), and hence   this approach would probably need integers wider than 32 bits.Costello                    Standards Track                    [Page 13]RFC 3492                     IDNA Punycode                    March 2003   Yet another approach for the decoder is to allow overflow to occur,   but to check the final output string by re-encoding it and comparing   to the decoder input.  If and only if they do not match (using a   case-insensitive ASCII comparison) overflow has occurred.  This   delayed-detection approach would not impose any more restrictions on   the input than the immediate-detection approach does, and might be   considered simpler in some programming languages.   In fact, if the decoder is used only inside the IDNA ToUnicode   operation [IDNA], then it need not check for overflow at all, because

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -