utf8.3

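From "Implementation code for the DDNS protocol module of the networking part of a video surveillance system; corrections are welcome." · 3 code · 297 lines in total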

.\" Automatically generated by Pod::Man 2.16 (Pod::Simple 3.05)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sh \" Subsection heading
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.ie \nF \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    nr % 0
.    rr F
.\}
.el \{\
.    de IX
..
.\}
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "utf8 3"
.TH utf8 3 "2007-12-18" "perl v5.10.0" "Perl Programmers Reference Guide"
.\" For nroff, turn off justification.
.\" Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
utf8 \- Perl pragma to enable/disable UTF\-8 (or UTF\-EBCDIC) in source code
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 2
\&    use utf8;
\&    no utf8;
\&
\&    # Convert a Perl scalar to/from UTF\-8.
\&    $num_octets = utf8::upgrade($string);
\&    $success    = utf8::downgrade($string[, FAIL_OK]);
\&
\&    # Change the native bytes of a Perl scalar to/from UTF\-8 bytes.
\&    utf8::encode($string);
\&    utf8::decode($string);
\&
\&    $flag = utf8::is_utf8(STRING); # since Perl 5.8.1
\&    $flag = utf8::valid(STRING);
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
The \f(CW\*(C`use utf8\*(C'\fR pragma tells the Perl parser to allow \s-1UTF\-8\s0 in the
program text in the current lexical scope (allow UTF-EBCDIC on \s-1EBCDIC\s0 based
platforms).  The \f(CW\*(C`no utf8\*(C'\fR pragma tells Perl to switch back to treating
the source text as literal bytes in the current lexical scope.
.PP
\&\fBDo not use this pragma for anything else than telling Perl that your
script is written in \s-1UTF\-8\s0.\fR The utility functions described below are
directly usable without \f(CW\*(C`use utf8;\*(C'\fR.
.PP
Because it is not possible to reliably tell \s-1UTF\-8\s0 from native 8 bit
encodings, you need either a Byte Order Mark at the beginning of your
source code, or \f(CW\*(C`use utf8;\*(C'\fR, to instruct perl.
.PP
When \s-1UTF\-8\s0 becomes the standard source format, this pragma will
effectively become a no-op.
For convenience in what follows the term
\&\fIUTF-X\fR is used to refer to \s-1UTF\-8\s0 on \s-1ASCII\s0 and \s-1ISO\s0 Latin based
platforms and UTF-EBCDIC on \s-1EBCDIC\s0 based platforms.
.PP
See also the effects of the \f(CW\*(C`\-C\*(C'\fR switch and its cousin, the
\&\f(CW$ENV{PERL_UNICODE}\fR, in perlrun.
.PP
Enabling the \f(CW\*(C`utf8\*(C'\fR pragma has the following effect:
.IP "\(bu" 4
Bytes in the source text that have their high-bit set will be treated
as being part of a literal UTF-X sequence.  This includes most
literals such as identifier names, string constants, and constant
regular expression patterns.
.Sp
On \s-1EBCDIC\s0 platforms characters in the Latin 1 character set are
treated as being part of a literal UTF-EBCDIC character.
.PP
Note that if you have bytes with the eighth bit on in your script
(for example embedded Latin\-1 in your string literals), \f(CW\*(C`use utf8\*(C'\fR
will be unhappy since the bytes are most probably not well-formed
UTF-X.  If you want to have such bytes under \f(CW\*(C`use utf8\*(C'\fR, you can disable
this pragma until the end of the block (or file, if at top level) by
\&\f(CW\*(C`no utf8;\*(C'\fR.
.Sh "Utility functions"
.IX Subsection "Utility functions"
The following functions are defined in the \f(CW\*(C`utf8::\*(C'\fR package by the
Perl core.  You do not need to say \f(CW\*(C`use utf8\*(C'\fR to use these and in fact
you should not say that unless you really want to have \s-1UTF\-8\s0 source code.
.IP "\(bu" 4
\&\f(CW$num_octets\fR = utf8::upgrade($string)
.Sp
Converts in-place the internal octet sequence in the native encoding
(Latin\-1 or \s-1EBCDIC\s0) to the equivalent character sequence in \fIUTF-X\fR.
\&\fI\f(CI$string\fI\fR already encoded as characters does no harm.  Returns the
number of octets necessary to represent the string as \fIUTF-X\fR.
Can be
used to make sure that the \s-1UTF\-8\s0 flag is on, so that \f(CW\*(C`\ew\*(C'\fR or \f(CW\*(C`lc()\*(C'\fR
work as Unicode on strings containing characters in the range 0x80\-0xFF
(on \s-1ASCII\s0 and derivatives).
.Sp
\&\fBNote that this function does not handle arbitrary encodings.\fR
Therefore Encode is recommended for general purposes; see also
Encode.
.IP "\(bu" 4
\&\f(CW$success\fR = utf8::downgrade($string[, \s-1FAIL_OK\s0])
.Sp
Converts in-place the internal octet sequence in \fIUTF-X\fR to the
equivalent octet sequence in the native encoding (Latin\-1 or \s-1EBCDIC\s0).
\&\fI\f(CI$string\fI\fR already encoded as native 8 bit does no harm.  Can be used to
make sure that the \s-1UTF\-8\s0 flag is off, e.g. when you want to make sure
that the \fIsubstr()\fR or \fIlength()\fR function works with the usually faster
byte algorithm.
.Sp
Fails if the original \fIUTF-X\fR sequence cannot be represented in the
native 8 bit encoding.  On failure dies or, if the value of \f(CW\*(C`FAIL_OK\*(C'\fR is
true, returns false.
.Sp
Returns true on success.
.Sp
\&\fBNote that this function does not handle arbitrary encodings.\fR
Therefore Encode is recommended for general purposes; see also
Encode.
.IP "\(bu" 4
utf8::encode($string)
.Sp
Converts in-place the character sequence to the corresponding octet
sequence in \fIUTF-X\fR.  The \s-1UTF8\s0 flag is turned off, so that after this
operation, the string is a byte string.  Returns nothing.
.Sp
\&\fBNote that this function does not handle arbitrary encodings.\fR
Therefore Encode is recommended for general purposes; see also
Encode.
.IP "\(bu" 4
\&\f(CW$success\fR = utf8::decode($string)
.Sp
Attempts to convert in-place the octet sequence in \fIUTF-X\fR to the
corresponding character sequence.  The \s-1UTF\-8\s0 flag is turned on only if
the source string contains multiple-byte \fIUTF-X\fR characters.
If
\&\fI\f(CI$string\fI\fR is invalid as \fIUTF-X\fR, returns false; otherwise returns
true.
.Sp
\&\fBNote that this function does not handle arbitrary encodings.\fR
Therefore Encode is recommended for general purposes; see also
Encode.
.IP "\(bu" 4
\&\f(CW$flag\fR = utf8::is_utf8(\s-1STRING\s0)
.Sp
(Since Perl 5.8.1)  Test whether \s-1STRING\s0 is in \s-1UTF\-8\s0 internally.
Functionally the same as \fIEncode::is_utf8()\fR.
.IP "\(bu" 4
\&\f(CW$flag\fR = utf8::valid(\s-1STRING\s0)
.Sp
[\s-1INTERNAL\s0] Test whether \s-1STRING\s0 is in a consistent state regarding
\&\s-1UTF\-8\s0.  Will return true if it is well-formed \s-1UTF\-8\s0 and has the \s-1UTF\-8\s0 flag
on \fBor\fR if the string is held as bytes (both these states are 'consistent').
The main reason for this routine is to allow Perl's testsuite to check
that operations have left strings in a consistent state.  You most
probably want to use \fIutf8::is_utf8()\fR instead.
.PP
\&\f(CW\*(C`utf8::encode\*(C'\fR is like \f(CW\*(C`utf8::upgrade\*(C'\fR, but the \s-1UTF8\s0 flag is
cleared.  See perlunicode for more on the \s-1UTF8\s0 flag and the C \s-1API\s0
functions \f(CW\*(C`sv_utf8_upgrade\*(C'\fR, \f(CW\*(C`sv_utf8_downgrade\*(C'\fR, \f(CW\*(C`sv_utf8_encode\*(C'\fR,
and \f(CW\*(C`sv_utf8_decode\*(C'\fR, which are wrapped by the Perl functions
\&\f(CW\*(C`utf8::upgrade\*(C'\fR, \f(CW\*(C`utf8::downgrade\*(C'\fR, \f(CW\*(C`utf8::encode\*(C'\fR and
\&\f(CW\*(C`utf8::decode\*(C'\fR.  Also, the functions utf8::is_utf8, utf8::valid,
utf8::encode, utf8::decode, utf8::upgrade, and utf8::downgrade are
actually internal, and thus always available, without a \f(CW\*(C`require utf8\*(C'\fR
statement.
.SH "BUGS"
.IX Header "BUGS"
One can have Unicode in identifier names, but not in package/class or
subroutine names.
While some limited functionality towards this does
exist as of Perl 5.8.0, that is more accidental than designed; use of
Unicode for the said purposes is unsupported.
.PP
One reason for this unfinishedness is its (currently) inherent
unportability: since both package names and subroutine names may need
to be mapped to file and directory names, the Unicode capability of
the filesystem becomes important\*(-- and there unfortunately aren't
portable answers.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
perlunitut, perluniintro, perlrun, bytes, perlunicode
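.SH "EXAMPLES"
.IX Header "EXAMPLES"
The following is a minimal sketch, not taken from the Perl distribution, of
how the utility functions described above change a string.  It assumes an
\&\s-1ASCII\s0 (not \s-1EBCDIC\s0) platform; the variable names are illustrative only.
.PP
.Vb 17
\&    use strict;
\&    use warnings;
\&
\&    my $s = "\exE9";                  # one character, U+00E9, held as a native byte
\&    my $octets = utf8::upgrade($s);  # same character; UTF\-8 flag now on
\&    # $octets == 2, length($s) == 1, utf8::is_utf8($s) is true
\&
\&    utf8::encode($s);                # characters become UTF\-8 octets; flag off
\&    # length($s) == 2; the octets are 0xC3 0xA9
\&
\&    my $ok = utf8::decode($s);       # octets become characters again
\&    # $ok is true, length($s) == 1
\&
\&    # With FAIL_OK true, downgrade returns false instead of dying
\&    # when a character has no native 8\-bit form.
\&    my $wide = "\ex{263A}";
\&    my $fit  = utf8::downgrade($wide, 1);   # false
.Ve
.PP
In this sketch \f(CW\*(C`utf8::upgrade\*(C'\fR and \f(CW\*(C`utf8::downgrade\*(C'\fR leave the length of the
string in characters unchanged, while \f(CW\*(C`utf8::encode\*(C'\fR and \f(CW\*(C`utf8::decode\*(C'\fR
change it.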
