.Sp
Subpatterns, either delivered as a list-context result or as \f(CW$1\fR etc.
are tainted if \f(CW\*(C`use locale\*(C'\fR is in effect, and the subpattern regular
expression contains \f(CW\*(C`\ew\*(C'\fR (to match an alphanumeric character), \f(CW\*(C`\eW\*(C'\fR
(non-alphanumeric character), \f(CW\*(C`\es\*(C'\fR (whitespace character), or \f(CW\*(C`\eS\*(C'\fR
(non-whitespace character). The variables $& (matched pattern), $`
(pre-match), $' (post-match), and $+ (last match) are also tainted if
\&\f(CW\*(C`use locale\*(C'\fR is in effect and the regular expression contains \f(CW\*(C`\ew\*(C'\fR,
\&\f(CW\*(C`\eW\*(C'\fR, \f(CW\*(C`\es\*(C'\fR, or \f(CW\*(C`\eS\*(C'\fR.
.IP "\(bu" 4
\&\fBSubstitution operator\fR (\f(CW\*(C`s///\*(C'\fR):
.Sp
Has the same behavior as the match operator. Also, the left
operand of \f(CW\*(C`=~\*(C'\fR becomes tainted when \f(CW\*(C`use locale\*(C'\fR is in effect
if modified as a result of a substitution based on a regular
expression match involving \f(CW\*(C`\ew\*(C'\fR, \f(CW\*(C`\eW\*(C'\fR, \f(CW\*(C`\es\*(C'\fR, or \f(CW\*(C`\eS\*(C'\fR; or of
case-mapping with \f(CW\*(C`\el\*(C'\fR, \f(CW\*(C`\eL\*(C'\fR, \f(CW\*(C`\eu\*(C'\fR or \f(CW\*(C`\eU\*(C'\fR.
.IP "\(bu" 4
\&\fBOutput formatting functions\fR (\fIprintf()\fR and \fIwrite()\fR):
.Sp
Results are never tainted because otherwise even output from print,
for example \f(CW\*(C`print(1/7)\*(C'\fR, would have to be tainted if \f(CW\*(C`use locale\*(C'\fR is in
effect.
.IP "\(bu" 4
\&\fBCase-mapping functions\fR (\fIlc()\fR, \fIlcfirst()\fR, \fIuc()\fR, \fIucfirst()\fR):
.Sp
Results are tainted if \f(CW\*(C`use locale\*(C'\fR is in effect.
.IP "\(bu" 4
\&\fB\s-1POSIX\s0 locale-dependent functions\fR (\fIlocaleconv()\fR, \fIstrcoll()\fR,
\&\fIstrftime()\fR, \fIstrxfrm()\fR):
.Sp
Results are never tainted.
.IP "\(bu" 4
\&\fB\s-1POSIX\s0 character class tests\fR (\fIisalnum()\fR, \fIisalpha()\fR, \fIisdigit()\fR,
\&\fIisgraph()\fR, \fIislower()\fR, \fIisprint()\fR, \fIispunct()\fR, \fIisspace()\fR,
\fIisupper()\fR,
\&\fIisxdigit()\fR):
.Sp
True/false results are never tainted.
.PP
Three examples illustrate locale-dependent tainting.
The first program, which ignores its locale, won't run: a value taken
directly from the command line may not be used to name an output file
when taint checks are enabled.
.PP
.Vb 2
\&    #!/usr/local/bin/perl \-T
\&    # Run with taint checking
\&
\&    # Command line sanity check omitted...
\&    $tainted_output_file = shift;
\&
\&    # Fatal under \-T: the file name is still tainted.
\&    open(F, ">$tainted_output_file")
\&        or warn "Open of $tainted_output_file failed: $!\en";
.Ve
.PP
The program can be made to run by \*(L"laundering\*(R" the tainted value through
a regular expression: the second example\*(--which still ignores locale
information\*(--runs, creating the file named on its command line
if it can.
.PP
.Vb 1
\&    #!/usr/local/bin/perl \-T
\&
\&    $tainted_output_file = shift;
\&    # Launder the value: without "use locale", $& is not tainted.
\&    $tainted_output_file =~ m%[\ew/]+%;
\&    $untainted_output_file = $&;
\&
\&    open(F, ">$untainted_output_file")
\&        or warn "Open of $untainted_output_file failed: $!\en";
.Ve
.PP
Compare this with a similar but locale-aware program:
.PP
.Vb 1
\&    #!/usr/local/bin/perl \-T
\&
\&    $tainted_output_file = shift;
\&    use locale;
\&    # With "use locale" in effect, a match on \ew taints $&.
\&    $tainted_output_file =~ m%[\ew/]+%;
\&    $localized_output_file = $&;
\&
\&    open(F, ">$localized_output_file")
\&        or warn "Open of $localized_output_file failed: $!\en";
.Ve
.PP
This third program fails to run because $& is tainted: it is the result
of a match involving \f(CW\*(C`\ew\*(C'\fR while \f(CW\*(C`use locale\*(C'\fR is in effect.
.SH "ENVIRONMENT"
.IX Header "ENVIRONMENT"
.IP "\s-1PERL_BADLANG\s0" 12
.IX Item "PERL_BADLANG"
A string that can suppress Perl's warning about failed locale settings
at startup. Failure can occur if the locale support in the operating
system is lacking (broken) in some way\*(--or if you mistyped the name of
a locale when you set up your environment.
If this environment
variable is absent, or has a value that does not evaluate to integer
zero (that is, \*(L"0\*(R" or ""), Perl will complain about locale setting
failures.
.Sp
\&\fB\s-1NOTE\s0\fR: \s-1PERL_BADLANG\s0 only gives you a way to hide the warning message.
The message reports a problem in your system's locale support,
and you should investigate what the problem is.
.PP
The following environment variables are not specific to Perl: They are
part of the standardized (\s-1ISO\s0 C, \s-1XPG4\s0, \s-1POSIX\s0 1.c) \fIsetlocale()\fR method
for controlling an application's opinion on data.
.IP "\s-1LC_ALL\s0" 12
.IX Item "LC_ALL"
\&\f(CW\*(C`LC_ALL\*(C'\fR is the \*(L"override-all\*(R" locale environment variable. If
set, it overrides all the rest of the locale environment variables.
.IP "\s-1LANGUAGE\s0" 12
.IX Item "LANGUAGE"
\&\fB\s-1NOTE\s0\fR: \f(CW\*(C`LANGUAGE\*(C'\fR is a \s-1GNU\s0 extension; it affects you only if you
are using the \s-1GNU\s0 libc. This is the case if you are using e.g. Linux.
If you are using \*(L"commercial\*(R" UNIXes you are most probably \fInot\fR
using \s-1GNU\s0 libc and you can ignore \f(CW\*(C`LANGUAGE\*(C'\fR.
.Sp
However, if you are using \f(CW\*(C`LANGUAGE\*(C'\fR, it affects the
language of informational, warning, and error messages output by
commands (in other words, it's like \f(CW\*(C`LC_MESSAGES\*(C'\fR), but it has higher
priority than \s-1LC_ALL\s0. Moreover, it's not a single value but
instead a \*(L"path\*(R" (\*(L":\*(R"\-separated list) of \fIlanguages\fR (not locales).
See the \s-1GNU\s0 \f(CW\*(C`gettext\*(C'\fR library documentation for more information.
.IP "\s-1LC_CTYPE\s0" 12
.IX Item "LC_CTYPE"
In the absence of \f(CW\*(C`LC_ALL\*(C'\fR, \f(CW\*(C`LC_CTYPE\*(C'\fR chooses the character type
locale.
In the absence of both \f(CW\*(C`LC_ALL\*(C'\fR and \f(CW\*(C`LC_CTYPE\*(C'\fR, \f(CW\*(C`LANG\*(C'\fR
chooses the character type locale.
.IP "\s-1LC_COLLATE\s0" 12
.IX Item "LC_COLLATE"
In the absence of \f(CW\*(C`LC_ALL\*(C'\fR, \f(CW\*(C`LC_COLLATE\*(C'\fR chooses the collation
(sorting) locale. In the absence of both \f(CW\*(C`LC_ALL\*(C'\fR and \f(CW\*(C`LC_COLLATE\*(C'\fR,
\&\f(CW\*(C`LANG\*(C'\fR chooses the collation locale.
.IP "\s-1LC_MONETARY\s0" 12
.IX Item "LC_MONETARY"
In the absence of \f(CW\*(C`LC_ALL\*(C'\fR, \f(CW\*(C`LC_MONETARY\*(C'\fR chooses the monetary
formatting locale. In the absence of both \f(CW\*(C`LC_ALL\*(C'\fR and \f(CW\*(C`LC_MONETARY\*(C'\fR,
\&\f(CW\*(C`LANG\*(C'\fR chooses the monetary formatting locale.
.IP "\s-1LC_NUMERIC\s0" 12
.IX Item "LC_NUMERIC"
In the absence of \f(CW\*(C`LC_ALL\*(C'\fR, \f(CW\*(C`LC_NUMERIC\*(C'\fR chooses the numeric format
locale. In the absence of both \f(CW\*(C`LC_ALL\*(C'\fR and \f(CW\*(C`LC_NUMERIC\*(C'\fR, \f(CW\*(C`LANG\*(C'\fR
chooses the numeric format.
.IP "\s-1LC_TIME\s0" 12
.IX Item "LC_TIME"
In the absence of \f(CW\*(C`LC_ALL\*(C'\fR, \f(CW\*(C`LC_TIME\*(C'\fR chooses the date and time
formatting locale. In the absence of both \f(CW\*(C`LC_ALL\*(C'\fR and \f(CW\*(C`LC_TIME\*(C'\fR,
\&\f(CW\*(C`LANG\*(C'\fR chooses the date and time formatting locale.
.IP "\s-1LANG\s0" 12
.IX Item "LANG"
\&\f(CW\*(C`LANG\*(C'\fR is the \*(L"catch-all\*(R" locale environment variable.
If it is set, it
is used as the last resort after the overall \f(CW\*(C`LC_ALL\*(C'\fR and the
category-specific \f(CW\*(C`LC_...\*(C'\fR.
.Sh "Examples"
.IX Subsection "Examples"
The \s-1LC_NUMERIC\s0 category controls numeric output:
.PP
.Vb 4
\&    use locale;
\&    use POSIX qw(locale_h); # Imports setlocale() and the LC_ constants.
\&    setlocale(LC_NUMERIC, "fr_FR") or die "Pardon";
\&    printf "%g\en", 1.23; # If the "fr_FR" succeeded, probably shows 1,23.
.Ve
.PP
and also how strings are parsed by \fIPOSIX::strtod()\fR as numbers:
.PP
.Vb 5
\&    use locale;
\&    use POSIX qw(locale_h strtod);
\&    setlocale(LC_NUMERIC, "de_DE") or die "Entschuldigung";
\&    my $x = strtod("2,34") + 5;
\&    print $x, "\en"; # Probably shows 7,34.
.Ve
.SH "NOTES"
.IX Header "NOTES"
.Sh "Backward compatibility"
.IX Subsection "Backward compatibility"
Versions of Perl prior to 5.004 \fBmostly\fR ignored locale information,
generally behaving as if something similar to the \f(CW"C"\fR locale were
always in force, even if the program environment suggested otherwise
(see \*(L"The setlocale function\*(R"). By default, Perl still behaves this
way for backward compatibility. If you want a Perl application to pay
attention to locale information, you \fBmust\fR use the \f(CW\*(C`use\ locale\*(C'\fR
pragma (see \*(L"The use locale pragma\*(R") to instruct it to do so.
.PP
Versions of Perl from 5.002 to 5.003 did use the \f(CW\*(C`LC_CTYPE\*(C'\fR
information if available; that is, \f(CW\*(C`\ew\*(C'\fR understood which
characters were letters according to the locale environment variables.
The problem was that the user had no control over the feature:
if the C library supported locales, Perl used them.
.Sh "I18N:Collate obsolete"
.IX Subsection "I18N:Collate obsolete"
In versions of Perl prior to 5.004, per-locale collation was possible
using the \f(CW\*(C`I18N::Collate\*(C'\fR library module. This module is now mildly
obsolete and should be avoided in new applications.
The \f(CW\*(C`LC_COLLATE\*(C'\fR
functionality is now integrated into the Perl core language: One can
use locale-specific scalar data completely normally with \f(CW\*(C`use locale\*(C'\fR,
so there is no longer any need to juggle with the scalar references of
\&\f(CW\*(C`I18N::Collate\*(C'\fR.
.Sh "Sort speed and memory use impacts"
.IX Subsection "Sort speed and memory use impacts"
Comparing and sorting by locale is usually slower than the default
sorting; slow-downs of two to four times have been observed. It will
also consume more memory: once a Perl scalar variable has participated
in any string comparison or sorting operation obeying the locale
collation rules, it will take 3\-15 times more memory than before. (The
exact multiplier depends on the string's contents, the operating system
and the locale.) These downsides are dictated more by the operating
system's implementation of the locale system than by Perl.
.Sh "\fIwrite()\fP and \s-1LC_NUMERIC\s0"
.IX Subsection "write() and LC_NUMERIC"
Formats are the only part of Perl that unconditionally uses information
from a program's locale; if a program's environment specifies an
\&\s-1LC_NUMERIC\s0 locale, it is always used to specify the decimal point
character in formatted output. Formatted output cannot be controlled by
\&\f(CW\*(C`use locale\*(C'\fR because the pragma is tied to the block structure of the
program, and, for historical reasons, formats exist outside that block
structure.
.Sh "Freely available locale definitions"
.IX Subsection "Freely available locale definitions"
There is a large collection of locale definitions at
ftp://dkuug.dk/i18n/WG15\-collection . You should be aware that it is
unsupported, and is not claimed to be fit for any purpose.
If your
system allows installation of arbitrary locales, you may find the
definitions useful as they are, or as a basis for the development of
your own locales.
.Sh "I18n and l10n"
.IX Subsection "I18n and l10n"
\&\*(L"Internationalization\*(R" is often abbreviated as \fBi18n\fR because its first
and last letters are separated by eighteen others. (You may guess why
the internalin ... internaliti ... i18n tends to get abbreviated.) In
the same way, \*(L"localization\*(R" is often abbreviated to \fBl10n\fR.
.Sh "An imperfect standard"
.IX Subsection "An imperfect standard"
Internationalization, as defined in the C and \s-1POSIX\s0 standards, can be
criticized as incomplete, ungainly, and having too large a granularity.
(Locales apply to a whole process, when it would arguably be more useful
to have them apply to a single thread, window group, or whatever.) They
also have a tendency, like standards groups, to divide the world into
nations, when we all know that the world can equally well be divided
into bankers, bikers, gamers, and so on. But, for now, it's the only
standard we've got. This may be construed as a bug.
.SH "Unicode and UTF\-8"
.IX Header "Unicode and UTF-8"
Unicode support was introduced in Perl version 5.6 and is
more fully implemented in version 5.8. See perluniintro and
perlunicode for more details.
.PP
Usually locale settings and Unicode do not affect each other, but
there are exceptions; see \*(L"Locales\*(R" in perlunicode for examples.
.SH "BUGS"
.IX Header "BUGS"
.Sh "Broken systems"
.IX Subsection "Broken systems"
In certain systems, the operating system's locale support
is broken and cannot be fixed or used by Perl. Such deficiencies can
and will result in mysterious hangs and/or Perl core dumps when
\&\f(CW\*(C`use locale\*(C'\fR is in effect. When confronted with such a system,
please report in excruciating detail to <\fIperlbug@perl.org\fR>, and
complain to your vendor: bug fixes may exist for these problems
in your operating system.
Sometimes such bug fixes are called an
operating system upgrade.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
I18N::Langinfo, perluniintro, perlunicode, open,
\&\*(L"isalnum\*(R" in \s-1POSIX\s0, \*(L"isalpha\*(R" in \s-1POSIX\s0,
\&\*(L"isdigit\*(R" in \s-1POSIX\s0, \*(L"isgraph\*(R" in \s-1POSIX\s0, \*(L"islower\*(R" in \s-1POSIX\s0,
\&\*(L"isprint\*(R" in \s-1POSIX\s0, \*(L"ispunct\*(R" in \s-1POSIX\s0, \*(L"isspace\*(R" in \s-1POSIX\s0,
\&\*(L"isupper\*(R" in \s-1POSIX\s0, \*(L"isxdigit\*(R" in \s-1POSIX\s0, \*(L"localeconv\*(R" in \s-1POSIX\s0,
\&\*(L"setlocale\*(R" in \s-1POSIX\s0, \*(L"strcoll\*(R" in \s-1POSIX\s0, \*(L"strftime\*(R" in \s-1POSIX\s0,
\&\*(L"strtod\*(R" in \s-1POSIX\s0, \*(L"strxfrm\*(R" in \s-1POSIX\s0.
.SH "HISTORY"
.IX Header "HISTORY"
Jarkko Hietaniemi's original \fIperli18n.pod\fR heavily hacked by Dominic
Dunlop, assisted by the perl5\-porters. Prose worked over a bit by
Tom Christiansen.
.PP
Last update: Thu Jun 11 08:44:13 \s-1MDT\s0 1998