📄 perllocale.pod
LC_ALL set to "En_US" must have been the bad choice, as shown by the
error message.  First try fixing locale settings listed first.

Second, if using the listed commands you see something B<exactly>
(prefix matches do not count and case usually counts) like "En_US"
without the quotes, then you should be okay because you are using a
locale name that should be installed and available in your system.
In this case, see L<Permanently fixing your system's locale configuration>.

=head2 Permanently fixing your system's locale configuration

This is when you see something like:

    perl: warning: Please check that your locale settings:
            LC_ALL = "En_US",
            LANG = (unset)
        are supported and installed on your system.

but then cannot see "En_US" listed by the above-mentioned
commands.  You may see things like "en_US.ISO8859-1", but that isn't
the same.  In this case, try running under a locale
that you can list and which somehow matches what you tried.  The
rules for matching locale names are a bit vague because
standardization is weak in this area.  See L<Finding locales> again
for the general rules.

=head2 Fixing system locale configuration

Contact a system administrator (preferably your own) and report the exact
error message you get, and ask them to read this same documentation you
are now reading.  They should be able to check whether there is something
wrong with the locale configuration of the system.  The L<Finding locales>
section is unfortunately a bit vague about the exact commands and places
because these things are not that standardized.

=head2 The localeconv function

The POSIX::localeconv() function allows you to get particulars of the
locale-dependent numeric formatting information specified by the current
C<LC_NUMERIC> and C<LC_MONETARY> locales.  (If you just want the name of
the current locale for a particular category, use POSIX::setlocale()
with a single parameter--see L<The setlocale function>.)
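
As a minimal sketch of the single-parameter form just mentioned, the
following asks for the name of the current C<LC_NUMERIC> locale without
changing it.  The variable name is illustrative only, and the name
printed depends entirely on how the running system is configured:

    use POSIX qw(locale_h);

    # Query only: with one argument, setlocale() returns the current
    # locale name for that category and changes nothing.
    my $numeric_locale = setlocale(LC_NUMERIC);
    print "LC_NUMERIC is ",
          (defined $numeric_locale ? $numeric_locale : "(unknown)"), "\n";

Returning to localeconv() itself, this lists everything it reports:
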

    use POSIX qw(locale_h);

    # Get a reference to a hash of locale-dependent info
    $locale_values = localeconv();

    # Output sorted list of the values
    for (sort keys %$locale_values) {
        printf "%-20s = %s\n", $_, $locale_values->{$_}
    }

localeconv() takes no arguments, and returns B<a reference to> a hash.
The keys of this hash are variable names for formatting, such as
C<decimal_point> and C<thousands_sep>.  The values are the
corresponding, er, values.  See L<POSIX/localeconv> for a longer
example listing the categories an implementation might be expected to
provide; some provide more and others fewer.  You don't need an
explicit C<use locale>, because localeconv() always observes the
current locale.

Here's a simple-minded example program that rewrites its command-line
parameters as integers correctly formatted in the current locale:

    # See comments in previous example
    require 5.004;
    use POSIX qw(locale_h);

    # Get some of locale's numeric formatting parameters
    my ($thousands_sep, $grouping) =
        @{localeconv()}{'thousands_sep', 'grouping'};

    # Apply defaults if values are missing
    $thousands_sep = ',' unless $thousands_sep;

    # grouping and mon_grouping are packed lists
    # of small integers (characters) telling the
    # grouping (thousands_sep and mon_thousands_sep
    # being the group dividers) of numbers and
    # monetary quantities.  The integers' meanings:
    # 255 means no more grouping, 0 means repeat
    # the previous grouping, 1-254 means use that
    # as the current grouping.  Grouping goes from
    # right to left (low to high digits).  In the
    # below we cheat slightly by never using anything
    # else than the first grouping (whatever that is).
    my @grouping;
    if ($grouping) {
        @grouping = unpack("C*", $grouping);
    } else {
        @grouping = (3);
    }

    # Format command line params for current locale
    for (@ARGV) {
        $_ = int;    # Chop non-integer part
        # \Q...\E keeps a separator such as '.' from acting
        # as a regular expression metacharacter
        1 while
        s/(\d)(\d{$grouping[0]}($|\Q$thousands_sep\E))/$1$thousands_sep$2/;
        print "$_ ";
    }
    print "\n";

=head1 LOCALE CATEGORIES

The following subsections describe basic locale categories.
Beyond these, some combination categories allow manipulation of more
than one basic category at a time.  See L<"ENVIRONMENT"> for a
discussion of these.

=head2 Category LC_COLLATE: Collation

In the scope of S<C<use locale>>, Perl looks to the C<LC_COLLATE>
environment variable to determine the application's notions on collation
(ordering) of characters.  For example, 'b' follows 'a' in Latin
alphabets, but where do 'E<aacute>' and 'E<aring>' belong?  And while
'color' follows 'chocolate' in English, what about in Spanish?

The following collations all make sense and you may meet any of them
if you "use locale".

    A B C D E a b c d e
    A a B b C c D d E e
    a A b B c C d D e E
    a b c d e A B C D E

Here is a code snippet to tell what "word"
characters are in the current locale, in that locale's order:

    use locale;
    print +(sort grep /\w/, map { chr } 0..255), "\n";

Compare this with the characters that you see and their order if you
state explicitly that the locale should be ignored:

    no locale;
    print +(sort grep /\w/, map { chr } 0..255), "\n";

This machine-native collation (which is what you get unless S<C<use
locale>> has appeared earlier in the same block) must be used for
sorting raw binary data, whereas the locale-dependent collation of the
first example is useful for natural text.

As noted in L<USING LOCALES>, C<cmp> compares according to the current
collation locale when C<use locale> is in effect, but falls back to a
byte-by-byte comparison for strings that the locale says are equal.
You can use POSIX::strcoll() if you don't want this fall-back:

    use POSIX qw(strcoll);
    $equal_in_locale =
        !strcoll("space and case ignored", "SpaceAndCaseIgnored");

$equal_in_locale will be true if the collation locale specifies a
dictionary-like ordering that ignores space characters completely and
which folds case.

If you have a single string that you want to check for "equality in
locale" against several others, you might think you could gain a little
efficiency by using POSIX::strxfrm() in conjunction with C<eq>:

    use POSIX qw(strxfrm);
    $xfrm_string = strxfrm("Mixed-case string");
    print "locale collation ignores spaces\n"
        if $xfrm_string eq strxfrm("Mixed-casestring");
    print "locale collation ignores hyphens\n"
        if $xfrm_string eq strxfrm("Mixedcase string");
    print "locale collation ignores case\n"
        if $xfrm_string eq strxfrm("mixed-case string");

strxfrm() takes a string and maps it into a transformed string for use
in byte-by-byte comparisons against other transformed strings during
collation.  "Under the hood", locale-affected Perl comparison operators
call strxfrm() for both operands, then do a byte-by-byte
comparison of the transformed strings.  By calling strxfrm() explicitly
and using a comparison that is not locale-affected, the example attempts
to save a couple of transformations.  But in fact, it doesn't save
anything: Perl magic (see L<perlguts/Magic Variables>) creates the
transformed version of a string the first time it's needed in a
comparison, then keeps this version around in case it's needed again.
An example rewritten the easy way with C<cmp> runs just about as fast.
It also copes with null characters embedded in strings; if you call
strxfrm() directly, it treats the first null it finds as a terminator.
Also, don't expect the transformed strings it produces to be portable
across systems--or even from one revision of your operating system to
the next.
In short, don't call strxfrm() directly: let Perl do it for you.

Note: C<use locale> isn't shown in some of these examples because it isn't
needed: strcoll() and strxfrm() exist only to generate locale-dependent
results, and so always obey the current C<LC_COLLATE> locale.

=head2 Category LC_CTYPE: Character Types

In the scope of S<C<use locale>>, Perl obeys the C<LC_CTYPE> locale
setting.  This controls the application's notion of which characters are
alphabetic.  This affects Perl's C<\w> regular expression metanotation,
which stands for alphanumeric characters--that is, alphabetic and
numeric characters, plus the underscore (but not the hyphen).  (Consult
L<perlre> for more information about regular expressions.)  Thanks to
C<LC_CTYPE>, depending on your locale setting, characters like
'E<aelig>', 'E<eth>', 'E<szlig>', and 'E<oslash>' may be understood as
C<\w> characters.

The C<LC_CTYPE> locale also provides the map used in transliterating
characters between lower and uppercase.  This affects the case-mapping
functions--lc(), lcfirst(), uc(), and ucfirst(); case-mapping
interpolation with C<\l>, C<\L>, C<\u>, or C<\U> in double-quoted strings
and C<s///> substitutions; and case-independent regular expression
pattern matching using the C<i> modifier.

Finally, C<LC_CTYPE> affects the POSIX character-class test
functions--isalpha(), islower(), and so on.  For example, if you move
from the "C" locale to a 7-bit Scandinavian one, you may find--possibly
to your surprise--that "|" moves from the ispunct() class to isalpha().

B<Note:> A broken or malicious C<LC_CTYPE> locale definition may result
in clearly ineligible characters being considered to be alphanumeric by
your application.  For strict matching of (mundane) letters and
digits--for example, in command strings--locale-aware applications
should use C<\w> inside a C<no locale> block.
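
As a hedged sketch of that advice, the helper below validates a command
word with C<\w> inside a C<no locale> block, so that the current
C<LC_CTYPE> locale has no say in the check.  The name is_plain_word() is
made up for this illustration.  The C</a> modifier is an addition that
assumes Perl 5.14 or later; it restricts C<\w> to ASCII letters, digits,
and the underscore even for strings that Perl holds as Unicode:

    use 5.014;                  # /a needs Perl 5.14 or later

    # Hypothetical helper: true only for a non-empty string made of
    # ASCII letters, digits, and underscores.
    sub is_plain_word {
        my ($str) = @_;
        no locale;              # ignore LC_CTYPE within this block
        return $str =~ /\A\w+\z/a ? 1 : 0;
    }

    for my $cmd ('list_all', 'rm|more', 'na>me') {
        printf "%-10s %s\n", $cmd,
            is_plain_word($cmd) ? "accepted" : "rejected";
    }

Only the first of the three sample strings is accepted; the ones
containing "|" and ">" are rejected whatever the locale claims about
those characters.
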
See L<"SECURITY">.

=head2 Category LC_NUMERIC: Numeric Formatting

In the scope of S<C<use locale>>, Perl obeys the C<LC_NUMERIC> locale
information, which controls an application's idea of how numbers should
be formatted for human readability by the printf(), sprintf(), and
write() functions.  String-to-numeric conversion by the POSIX::strtod()
function is also affected.  In most implementations the only effect is to
change the character used for the decimal point--perhaps from '.' to ','.
These functions aren't aware of such niceties as thousands separation and
so on.  (See L<The localeconv function> if you care about these things.)

Output produced by print() is also affected by the current locale:
within the scope of C<use locale> it follows C<LC_NUMERIC>, while under
C<no locale> it corresponds to what you'd get from printf() in the "C"
locale.  The same is true for Perl's internal conversions between
numeric and string formats:

    use POSIX qw(strtod);
    use locale;

    $n = 5/2;   # Assign numeric 2.5 to $n

    $a = " $n"; # Locale-dependent conversion to string

    print "half five is $n\n";       # Locale-dependent output

    printf "half five is %g\n", $n;  # Locale-dependent output

    print "DECIMAL POINT IS COMMA\n"
        if $n == (strtod("2,5"))[0]; # Locale-dependent conversion

=head2 Category LC_MONETARY: Formatting of monetary amounts

The C standard defines the C<LC_MONETARY> category, but no function
that is affected by its contents.  (Those with experience of standards
committees will recognize that the working group decided to punt on the
issue.)  Consequently, Perl takes no notice of it.  If you really want
to use C<LC_MONETARY>, you can query its contents--see
L<The localeconv function>--and use the information that it returns in
your application's own formatting of currency amounts.
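
A rough sketch of such home-grown formatting follows.  The function
format_amount() is hypothetical.  It reads only three of the keys that
localeconv() may return (C<currency_symbol>, C<mon_decimal_point>, and
C<mon_thousands_sep>), and it assumes a non-negative amount, two decimal
places, groups of three digits, and a leading symbol, where a real
application would also consult C<frac_digits>, C<mon_grouping>,
C<p_cs_precedes>, and related entries.  The fallback values are
assumptions for locales, such as "C", that leave these entries empty:

    use POSIX qw(locale_h);

    # Hypothetical helper: format a non-negative amount using a hash
    # reference shaped like the one localeconv() returns.
    sub format_amount {
        my ($amount, $lc) = @_;
        my $symbol = $lc->{currency_symbol}   || '';
        my $point  = $lc->{mon_decimal_point} || '.';
        my $sep    = $lc->{mon_thousands_sep} || ',';

        # Fix two decimals, then split whole and fractional parts
        my ($whole, $frac) = split /\./, sprintf("%.2f", $amount);

        # Insert the separator between groups of three digits
        1 while $whole =~ s/^(\d+)(\d{3})/$1$sep$2/;

        return "$symbol$whole$point$frac";
    }

    print format_amount(1234567.891, localeconv()), "\n";

Passing the hash reference in as a parameter, rather than calling
localeconv() inside the function, makes it easy to try the formatting
against hand-written values.
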
However, you may well find that the information, voluminous and complex
though it may be, still does not quite meet your requirements: currency
formatting is a hard nut to crack.

=head2 LC_TIME

Output produced by POSIX::strftime(), which builds a formatted
human-readable date/time string, is affected by the current C<LC_TIME>
locale.  Thus, in a French locale, the output produced by the C<%B>
format element (full month name) for the first month of the year would
be "janvier".  Here's how to get a list of long month names in the
current locale:

    use POSIX qw(strftime);
    for (0..11) {
        $long_month_name[$_] =
            strftime("%B", 0, 0, 0, 1, $_, 96);
    }

Note: C<use locale> isn't needed in this example: as a function that
exists only to generate locale-dependent results, strftime() always
obeys the current C<LC_TIME> locale.

=head2 Other categories

The remaining locale category, C<LC_MESSAGES> (possibly supplemented
by others in particular implementations) is not currently used by
Perl--except possibly to affect the behavior of library functions
called by extensions outside the standard Perl distribution and by the
operating system and its utilities.  Note especially that the string
value of C<$!> and the error messages given by external utilities may
be changed by C<LC_MESSAGES>.  If you want to have portable error
codes, use C<%!>.  See L<Errno>.

=head1 SECURITY

Although the main discussion of Perl security issues can be found in
L<perlsec>, a discussion of Perl's locale handling would be incomplete
if it did not draw your attention to locale-dependent security issues.
Locales--particularly on systems that allow unprivileged users to
build their own locales--are untrustworthy.  A malicious (or just plain
broken) locale can make a locale-aware application give unexpected
results.
Here are a few possibilities:

=over 4

=item *

Regular expression checks for safe file names or mail addresses using
C<\w> may be spoofed by an C<LC_CTYPE> locale that claims that
characters such as "E<gt>" and "|" are alphanumeric.

=item *

String interpolation with case-mapping, as in, say, C<$dest =
"C:\U$name.$ext">, may produce dangerous results if a bogus C<LC_CTYPE>
case-mapping table is in effect.

=item *

A sneaky C<LC_COLLATE> locale could result in the names of students with
"D" grades appearing ahead of those with "A"s.

=item *

An application that takes the trouble to use information in
C<LC_MONETARY> may format debits as if they were credits and vice versa
if that locale has been subverted.  Or it might make payments in US
dollars instead of Hong Kong dollars.