perlunicode.1

.\" Automatically generated by Pod::Man 2.16 (Pod::Simple 3.05)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sh \" Subsection heading
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.ie \nF \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    nr % 0
.    rr F
.\}
.el \{\
.    de IX
..
.\}
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "PERLUNICODE 1"
.TH PERLUNICODE 1 "2007-12-18" "perl v5.10.0" "Perl Programmers Reference Guide"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
perlunicode \- Unicode support in Perl
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
.Sh "Important Caveats"
.IX Subsection "Important Caveats"
Unicode support is an extensive requirement. While Perl does not
implement the Unicode standard or the accompanying technical reports
from cover to cover, Perl does support many Unicode features.
.PP
People who want to learn to use Unicode in Perl should probably read
the Perl Unicode tutorial before reading this reference
document.
.IP "Input and Output Layers" 4
.IX Item "Input and Output Layers"
Perl knows when a filehandle uses Perl's internal Unicode encodings
(\s-1UTF\-8\s0, or UTF-EBCDIC if in \s-1EBCDIC\s0) if the filehandle is opened with
the \*(L":utf8\*(R" layer.  Other encodings can be converted to Perl's
encoding on input or from Perl's encoding on output by use of the
\&\*(L":encoding(...)\*(R"  layer.  See open.
.Sp
To indicate that Perl source itself is in \s-1UTF\-8\s0, use \f(CW\*(C`use utf8;\*(C'\fR.
.IP "Regular Expressions" 4
.IX Item "Regular Expressions"
The regular expression compiler produces polymorphic opcodes.  That is,
the pattern adapts to the data and automatically switches to the Unicode
character scheme when presented with data that is internally encoded in
\&\s-1UTF\-8\s0 \*(-- or instead uses a traditional byte scheme when presented with
byte data.
.ie n .IP """use utf8"" still needed to enable \s-1UTF\-8/UTF\-EBCDIC\s0 in scripts" 4
.el .IP "\f(CWuse utf8\fR still needed to enable \s-1UTF\-8/UTF\-EBCDIC\s0 in scripts" 4
.IX Item "use utf8 still needed to enable UTF-8/UTF-EBCDIC in scripts"
As a compatibility measure, the \f(CW\*(C`use utf8\*(C'\fR pragma must be explicitly
included to enable recognition of \s-1UTF\-8\s0 in the Perl scripts themselves
(in string or regular expression literals, or in identifier names) on
ASCII-based machines or to recognize UTF-EBCDIC on EBCDIC-based
machines.  \fBThese are the only times when an explicit \f(CB\*(C`use utf8\*(C'\fB
is needed.\fR  See utf8.
.IP "BOM-marked scripts and \s-1UTF\-16\s0 scripts autodetected" 4
.IX Item "BOM-marked scripts and UTF-16 scripts autodetected"
If a Perl script begins marked with the Unicode \s-1BOM\s0 (\s-1UTF\-16LE\s0, \s-1UTF\-16BE\s0,
or \s-1UTF\-8\s0), or if the script looks like non-BOM-marked \s-1UTF\-16\s0 of either
endianness, Perl will correctly read in the script as Unicode.
(BOMless \s-1UTF\-8\s0 cannot be effectively recognized or differentiated from
\&\s-1ISO\s0 8859\-1 or other eight-bit encodings.)
.ie n .IP """use encoding"" needed to upgrade non\-Latin\-1 byte strings" 4
.el .IP "\f(CWuse encoding\fR needed to upgrade non\-Latin\-1 byte strings" 4
.IX Item "use encoding needed to upgrade non-Latin-1 byte strings"
By default, there is a fundamental asymmetry in Perl's Unicode model:
implicit upgrading from byte strings to Unicode strings assumes that
they were encoded in \fI\s-1ISO\s0 8859\-1 (Latin\-1)\fR, but Unicode strings are
downgraded with \s-1UTF\-8\s0 encoding.  This happens because the first 256
codepoints in Unicode happen to agree with Latin\-1.
.Sp
See \*(L"Byte and Character Semantics\*(R" for more details.
.Sh "Byte and Character Semantics"
.IX Subsection "Byte and Character Semantics"
Beginning with version 5.6, Perl uses logically-wide characters to
represent strings internally.
.PP
In future, Perl-level operations will be expected to work with
characters rather than bytes.
.PP
However, as an interim compatibility measure, Perl aims to
provide a safe migration path from byte semantics to character
semantics for programs.  For operations where Perl can unambiguously
decide that the input data are characters, Perl switches to
character semantics.  For operations where this determination cannot
be made without additional information from the user, Perl decides in
favor of compatibility and chooses to use byte semantics.
.PP
This behavior preserves compatibility with earlier versions of Perl,
which allowed byte semantics in Perl operations only if
none of the program's inputs were marked as being a source of Unicode
character data.  Such data may come from filehandles, from calls to
external programs, from information provided by the system (such as \f(CW%ENV\fR),
or from literals and constants in the source text.
.PP
The \f(CW\*(C`bytes\*(C'\fR pragma will always, regardless of platform, force byte
semantics in a particular lexical scope.  See bytes.
.PP
The \f(CW\*(C`utf8\*(C'\fR pragma is primarily a compatibility device that enables
recognition of \s-1UTF\-\s0(8|EBCDIC) in literals encountered by the parser.
Note that this pragma is only required while Perl defaults to byte
semantics; when character semantics become the default, this pragma
may become a no-op.  See utf8.
.PP
Unless explicitly stated, Perl operators use character semantics
for Unicode data and byte semantics for non-Unicode data.
The decision to use character semantics is made transparently.  If
input data comes from a Unicode source\*(--for example, if a character
encoding layer is added to a filehandle or a literal Unicode
string constant appears in a program\*(--character semantics apply.
Otherwise, byte semantics are in effect.  The \f(CW\*(C`bytes\*(C'\fR pragma should
be used to force byte semantics on Unicode data.
.PP
If strings operating under byte semantics and strings with Unicode
character data are concatenated, the new string will be created by
decoding the byte strings as \fI\s-1ISO\s0 8859\-1 (Latin\-1)\fR, even if the
old Unicode string used \s-1EBCDIC\s0.  This translation is done without
regard to the system's native 8\-bit encoding.
.PP
Under character semantics, many operations that formerly operated on
bytes now operate on characters. A character in Perl is
logically just a number ranging from 0 to 2**31 or so. Larger
characters may encode into longer sequences of bytes internally, but
this internal detail is mostly hidden from Perl code.
See perluniintro for more.
.Sh "Effects of Character Semantics"
.IX Subsection "Effects of Character Semantics"
Character semantics have the following effects:
.IP "\(bu" 4
Strings\*(--including hash keys\*(--and regular expression patterns may
contain characters that have an ordinal value larger than 255.
.Sp
If you use a Unicode editor to edit your program, Unicode characters may
occur directly within the literal strings in \s-1UTF\-8\s0 encoding, or \s-1UTF\-16\s0.
(The former requires a \s-1BOM\s0 or \f(CW\*(C`use utf8\*(C'\fR, the latter requires a \s-1BOM\s0.)
.Sp
Unicode characters can also be added to a string by using the \f(CW\*(C`\ex{...}\*(C'\fR
notation.  The Unicode code for the desired character, in hexadecimal,
should be placed in the braces. For instance, a smiley face is
\&\f(CW\*(C`\ex{263A}\*(C'\fR.  This encoding scheme works for all characters, but
for characters under 0x100, note that Perl may use an 8 bit encoding
internally, for optimization and/or backward compatibility.
.Sp
Additionally, if you
.Sp
.Vb 1
\&   use charnames \*(Aq:full\*(Aq;
.Ve
.Sp
you can use the \f(CW\*(C`\eN{...}\*(C'\fR notation and put the official Unicode
character name within the braces, such as \f(CW\*(C`\eN{WHITE SMILING FACE}\*(C'\fR.
.IP "\(bu" 4
If an appropriate encoding is specified, identifiers within the
Perl script may contain Unicode alphanumeric characters, including
ideographs.  Perl does not currently attempt to canonicalize variable
names.
.IP "\(bu" 4
Regular expressions match characters instead of bytes.  \*(L".\*(R" matches
a character instead of a byte.
.IP "\(bu" 4
Character classes in regular expressions match characters instead of
bytes and match against the character properties specified in the
Unicode properties database.  \f(CW\*(C`\ew\*(C'\fR can be used to match a Japanese
ideograph, for instance.
.IP "\(bu" 4
Named Unicode properties, scripts, and block ranges may be used like
character classes via the \f(CW\*(C`\ep{}\*(C'\fR \*(L"matches property\*(R" construct and
the \f(CW\*(C`\eP{}\*(C'\fR negation, \*(L"doesn't match property\*(R".
.Sp
See \*(L"Unicode Character Properties\*(R" for more details.
.Sp
You can define your own character properties and use them
in the regular expression with the \f(CW\*(C`\ep{}\*(C'\fR or \f(CW\*(C`\eP{}\*(C'\fR construct.
.Sp
See \*(L"User-Defined Character Properties\*(R" for more details.
.IP "\(bu" 4
The special pattern \f(CW\*(C`\eX\*(C'\fR matches any extended Unicode
sequence\-\-\*(L"a combining character sequence\*(R" in Standardese\*(--where the
first character is a base character and subsequent characters are mark
characters that apply to the base character.  \f(CW\*(C`\eX\*(C'\fR is equivalent to
\&\f(CW\*(C`(?:\ePM\epM*)\*(C'\fR.
.IP "\(bu" 4
The \f(CW\*(C`tr///\*(C'\fR operator translates characters instead of bytes.  Note
that the \f(CW\*(C`tr///CU\*(C'\fR functionality has been removed.  For similar
functionality see pack('U0', ...) and pack('C0', ...).
.IP "\(bu" 4
Case translation operators use the Unicode case translation tables
when character input is provided.  Note that \f(CW\*(C`uc()\*(C'\fR, or \f(CW\*(C`\eU\*(C'\fR in
interpolated strings, translates to uppercase, while \f(CW\*(C`ucfirst\*(C'\fR,
or \f(CW\*(C`\eu\*(C'\fR in interpolated strings, translates to titlecase in languages
that make the distinction.
.IP "\(bu" 4
Most operators that deal with positions or lengths in a string will
automatically switch to using character positions, including
\&\f(CW\*(C`chop()\*(C'\fR, \f(CW\*(C`chomp()\*(C'\fR, \f(CW\*(C`substr()\*(C'\fR, \f(CW\*(C`pos()\*(C'\fR, \f(CW\*(C`index()\*(C'\fR, \f(CW\*(C`rindex()\*(C'\fR,
\&\f(CW\*(C`sprintf()\*(C'\fR, \f(CW\*(C`write()\*(C'\fR, and \f(CW\*(C`length()\*(C'\fR.  An operator that
specifically does not switch is \f(CW\*(C`vec()\*(C'\fR.  Operators that really don't
care include operators that treat strings as a bucket of bits such as
\&\f(CW\*(C`sort()\*(C'\fR, and operators dealing with filenames.
