.\" Automatically generated by Pod::Man 2.16 (Pod::Simple 3.05)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sh \" Subsection heading
.br
.if t .Sp
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.ie \nF \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    nr % 0
.    rr F
.\}
.el \{\
.    de IX
..
.\}
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Unicode::Normalize 3"
.TH Unicode::Normalize 3 "2007-12-18" "perl v5.10.0" "Perl Programmers Reference Guide"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Unicode::Normalize \- Unicode Normalization Forms
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
(1) using function names exported by default:
.PP
.Vb 1
\&  use Unicode::Normalize;
\&
\&  $NFD_string  = NFD($string);  # Normalization Form D
\&  $NFC_string  = NFC($string);  # Normalization Form C
\&  $NFKD_string = NFKD($string); # Normalization Form KD
\&  $NFKC_string = NFKC($string); # Normalization Form KC
.Ve
.PP
(2) using function names exported on request:
.PP
.Vb 1
\&  use Unicode::Normalize \*(Aqnormalize\*(Aq;
\&
\&  $NFD_string  = normalize(\*(AqD\*(Aq,  $string);  # Normalization Form D
\&  $NFC_string  = normalize(\*(AqC\*(Aq,  $string);  # Normalization Form C
\&  $NFKD_string = normalize(\*(AqKD\*(Aq, $string);  # Normalization Form KD
\&  $NFKC_string = normalize(\*(AqKC\*(Aq, $string);  # Normalization Form KC
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Parameters:
.PP
\&\f(CW$string\fR is treated as a string under character semantics (see \fIperlunicode\fR).
.PP
\&\f(CW$code_point\fR should be an unsigned integer representing a Unicode code point.
.PP
Note: The \s-1XSUB\s0 and pure Perl implementations are incompatible
in how they interpret \f(CW$code_point\fR as a decimal number.
\&\s-1XSUB\s0 converts \f(CW$code_point\fR to an unsigned integer, but pure Perl does not.
Do not use a floating-point value or a negative sign in \f(CW$code_point\fR.
.Sh "Normalization Forms"
.IX Subsection "Normalization Forms"
.ie n .IP """$NFD_string = NFD($string)""" 4
.el .IP "\f(CW$NFD_string = NFD($string)\fR" 4
.IX Item "$NFD_string = NFD($string)"
It returns the Normalization Form D (formed by canonical decomposition).
.ie n .IP """$NFC_string = NFC($string)""" 4
.el .IP "\f(CW$NFC_string = NFC($string)\fR" 4
.IX Item "$NFC_string = NFC($string)"
It returns the Normalization Form C (formed by canonical decomposition
followed by canonical composition).
.ie n .IP """$NFKD_string = NFKD($string)""" 4
.el .IP "\f(CW$NFKD_string = NFKD($string)\fR" 4
.IX Item "$NFKD_string = NFKD($string)"
It returns the Normalization Form \s-1KD\s0 (formed by compatibility decomposition).
.ie n .IP """$NFKC_string = NFKC($string)""" 4
.el .IP "\f(CW$NFKC_string = NFKC($string)\fR" 4
.IX Item "$NFKC_string = NFKC($string)"
It returns the Normalization Form \s-1KC\s0 (formed by compatibility decomposition
followed by \fBcanonical\fR composition).
.ie n .IP """$FCD_string = FCD($string)""" 4
.el .IP "\f(CW$FCD_string = FCD($string)\fR" 4
.IX Item "$FCD_string = FCD($string)"
If the given string is in \s-1FCD\s0 (\*(L"Fast C or D\*(R" form; cf. \s-1UTN\s0 #5),
it returns the string without modification; otherwise it returns an \s-1FCD\s0 string.
.Sp
Note: \s-1FCD\s0 is not always unique; several distinct forms may be equivalent
to each other. \f(CW\*(C`FCD()\*(C'\fR will return one of these equivalent forms.
.ie n .IP """$FCC_string = FCC($string)""" 4
.el .IP "\f(CW$FCC_string = FCC($string)\fR" 4
.IX Item "$FCC_string = FCC($string)"
It returns the \s-1FCC\s0 form (\*(L"Fast C Contiguous\*(R"; cf. \s-1UTN\s0 #5).
.Sp
Note: \s-1FCC\s0 is unique, as are the four normalization forms (NF*).
.ie n .IP """$normalized_string = normalize($form_name, $string)""" 4
.el .IP "\f(CW$normalized_string = normalize($form_name, $string)\fR" 4
.IX Item "$normalized_string = normalize($form_name, $string)"
It returns the string normalized to the form named by \f(CW$form_name\fR.
.Sp
One of the following names must be given as \f(CW$form_name\fR.
.Sp
.Vb 4
\&  \*(AqC\*(Aq  or \*(AqNFC\*(Aq  for Normalization Form C  (UAX #15)
\&  \*(AqD\*(Aq  or \*(AqNFD\*(Aq  for Normalization Form D  (UAX #15)
\&  \*(AqKC\*(Aq or \*(AqNFKC\*(Aq for Normalization Form KC (UAX #15)
\&  \*(AqKD\*(Aq or \*(AqNFKD\*(Aq for Normalization Form KD (UAX #15)
\&
\&  \*(AqFCD\*(Aq          for "Fast C or D" Form  (UTN #5)
\&  \*(AqFCC\*(Aq          for "Fast C Contiguous" (UTN #5)
.Ve
.Sh "Decomposition and Composition"
.IX Subsection "Decomposition and Composition"
.ie n .IP """$decomposed_string = decompose($string [, $useCompatMapping])""" 4
.el .IP "\f(CW$decomposed_string = decompose($string [, $useCompatMapping])\fR" 4
.IX Item "$decomposed_string = decompose($string [, $useCompatMapping])"
It returns the concatenation of the decomposition of each character
in the string.
.Sp
If the second parameter (a boolean) is omitted or false,
the decomposition is canonical decomposition;
if the second parameter (a boolean) is true,
the decomposition is compatibility decomposition.
.Sp
The string returned is not always in \s-1NFD/NFKD\s0. Reordering may be required.
.Sp
.Vb 2
\&  $NFD_string  = reorder(decompose($string));     # equivalent to NFD()
\&  $NFKD_string = reorder(decompose($string, 1));  # equivalent to NFKD()
.Ve
.ie n .IP """$reordered_string = reorder($string)""" 4
.el .IP "\f(CW$reordered_string = reorder($string)\fR" 4
.IX Item "$reordered_string = reorder($string)"
It returns the result of reordering the combining characters
according to Canonical Ordering Behavior.
.Sp
For example, when you have a list of \s-1NFD/NFKD\s0 strings,
you can get the concatenated \s-1NFD/NFKD\s0 string from them with:
.Sp
.Vb 2
\&  $concat_NFD  = reorder(join \*(Aq\*(Aq, @NFD_strings);
\&  $concat_NFKD = reorder(join \*(Aq\*(Aq, @NFKD_strings);
.Ve
.ie n .IP """$composed_string = compose($string)""" 4
.el .IP "\f(CW$composed_string = compose($string)\fR" 4
.IX Item "$composed_string = compose($string)"
It returns the result of canonical composition
without applying any decomposition.
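The relationships described above between NFD(), NFC(), decompose(), reorder() and compose() can be checked with a short script. This is only a minimal sketch: the sample strings are chosen for illustration, and code_points() is a hypothetical helper defined here, not a function of the module.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC NFKD decompose reorder compose);

# Hypothetical helper (not part of Unicode::Normalize): show a string
# as a list of code points, so only ASCII is ever printed.
sub code_points {
    my ($str) = @_;
    return join ' ', map { sprintf 'U+%04X', ord } split //, $str;
}

# U+00C5 (A WITH RING ABOVE) canonically decomposes to U+0041 U+030A,
# and NFC composes that pair back into the single code point.
my $nfd = NFD("\x{C5}");
print 'NFD:  ', code_points($nfd), "\n";
print 'NFC:  ', code_points(NFC($nfd)), "\n";

# NFKD also applies compatibility mappings: U+FB01 (fi ligature) -> "fi".
print 'NFKD: ', code_points(NFKD("\x{FB01}")), "\n";

# Acute (combining class 230) is written before dot below (class 220),
# so the marks are out of canonical order. decompose() leaves the order
# alone; reorder() sorts the marks by combining class.
my $s       = "a\x{301}\x{323}";
my $by_hand = reorder(decompose($s));
print 'reorder(decompose): ', code_points($by_hand), "\n";

print "same as NFD()\n"          if $by_hand eq NFD($s);
print "compose() gives NFC()\n"  if compose($by_hand) eq NFC($s);
```

Printing code points rather than the strings themselves avoids "Wide character" warnings on an output handle that has no encoding layer.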