=head1 NAME

perllocale - Perl locale handling (internationalization and localization)

=head1 DESCRIPTION

Perl supports language-specific notions of data such as "is this
a letter", "what is the uppercase equivalent of this letter", and
"which of these letters comes first". These are important issues,
especially for languages other than English--but also for English: it
would be naE<iuml>ve to imagine that C<A-Za-z> defines all the "letters"
needed to write in English. Perl is also aware that some character other
than '.' may be preferred as a decimal point, and that output date
representations may be language-specific. The process of making an
application take account of its users' preferences in such matters is
called B<internationalization> (often abbreviated as B<i18n>); telling
such an application about a particular set of preferences is known as
B<localization> (B<l10n>).

Perl can understand language-specific data via the standardized (ISO C,
XPG4, POSIX 1.c) method called "the locale system". The locale system is
controlled per application using one pragma, one function call, and
several environment variables.

B<NOTE>: This feature is new in Perl 5.004, and does not apply unless an
application specifically requests it--see L<Backward compatibility>.
The one exception is that write() now B<always> uses the current locale;
see L<"NOTES">.

=head1 PREPARING TO USE LOCALES

If Perl applications are to understand and present your data
correctly according to a locale of your choice, B<all> of the following
must be true:

=over 4

=item *

B<Your operating system must support the locale system>. If it does,
you should find that the setlocale() function is a documented part of
its C library.

=item *

B<Definitions for locales that you use must be installed>. You, or
your system administrator, must make sure that this is the case. The
available locales, the location in which they are kept, and the manner
in which they are installed all vary from system to system.
Some systems
provide only a few, hard-wired locales and do not allow more to be
added. Others allow you to add "canned" locales provided by the system
supplier. Still others allow you or the system administrator to define
and add arbitrary locales. (You may have to ask your supplier to
provide canned locales that are not delivered with your operating
system.) Read your system documentation for further illumination.

=item *

B<Perl must believe that the locale system is supported>. If it does,
C<perl -V:d_setlocale> will say that the value for C<d_setlocale> is
C<define>.

=back

If you want a Perl application to process and present your data
according to a particular locale, the application code should include
the S<C<use locale>> pragma (see L<The use locale pragma>) where
appropriate, and B<at least one> of the following must be true:

=over 4

=item *

B<The locale-determining environment variables (see L<"ENVIRONMENT">)
must be correctly set up> at the time the application is started, either
by yourself or by whoever set up your system account.

=item *

B<The application must set its own locale> using the method described in
L<The setlocale function>.

=back

=head1 USING LOCALES

=head2 The use locale pragma

By default, Perl ignores the current locale. The S<C<use locale>>
pragma tells Perl to use the current locale for some operations:

=over 4

=item *

B<The comparison operators> (C<lt>, C<le>, C<cmp>, C<ge>, and C<gt>) and
the POSIX string collation functions strcoll() and strxfrm() use
C<LC_COLLATE>. sort() is also affected if used without an
explicit comparison function, because it uses C<cmp> by default.

B<Note:> C<eq> and C<ne> are unaffected by locale: they always
perform a byte-by-byte comparison of their scalar operands. What's
more, if C<cmp> finds that its operands are equal according to the
collation sequence specified by the current locale, it goes on to
perform a byte-by-byte comparison, and only returns I<0> (equal) if the
operands are bit-for-bit identical.
If you really want to know whether
two strings--which C<eq> and C<cmp> may consider different--are equal
as far as collation in the locale is concerned, see the discussion in
L<Category LC_COLLATE: Collation>.

=item *

B<Regular expressions and case-modification functions> (uc(), lc(),
ucfirst(), and lcfirst()) use C<LC_CTYPE>.

=item *

B<The formatting functions> (printf(), sprintf() and write()) use
C<LC_NUMERIC>.

=item *

B<The POSIX date formatting function> (strftime()) uses C<LC_TIME>.

=back

C<LC_COLLATE>, C<LC_CTYPE>, and so on, are discussed further in
L<LOCALE CATEGORIES>.

The default behavior is restored with the S<C<no locale>> pragma, or
upon reaching the end of the block enclosing C<use locale>.

The string result of any operation that uses locale
information is tainted, as it is possible for a locale to be
untrustworthy. See L<"SECURITY">.

=head2 The setlocale function

You can switch locales as often as you wish at run time with the
POSIX::setlocale() function:

    # This functionality not usable prior to Perl 5.004
    require 5.004;

    # Import locale-handling tool set from POSIX module.
    # This example uses: setlocale -- the function call
    #                    LC_CTYPE -- explained below
    use POSIX qw(locale_h);

    # query and save the old locale
    $old_locale = setlocale(LC_CTYPE);

    setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
    # LC_CTYPE now in locale "French, Canada, codeset ISO 8859-1"

    setlocale(LC_CTYPE, "");
    # LC_CTYPE now reset to default defined by LC_ALL/LC_CTYPE/LANG
    # environment variables. See below for documentation.

    # restore the old locale
    setlocale(LC_CTYPE, $old_locale);

The first argument of setlocale() gives the B<category>, the second the
B<locale>. The category tells in what aspect of data processing you
want to apply locale-specific rules. Category names are discussed in
L<LOCALE CATEGORIES> and L<"ENVIRONMENT">. The locale is the name of a
collection of customization information corresponding to a particular
combination of language, country or territory, and codeset.
Read on for
hints on the naming of locales: not all systems name locales as in the
example.

If no second argument is provided and the category is something other
than LC_ALL, the function returns a string naming the current locale
for the category. You can use this value as the second argument in a
subsequent call to setlocale().

If no second argument is provided and the category is LC_ALL, the
result is implementation-dependent. It may be a string of
concatenated locale names (separator also implementation-dependent)
or a single locale name. Please consult your L<setlocale(3)> for
details.

If a second argument is given and it corresponds to a valid locale,
the locale for the category is set to that value, and the function
returns the now-current locale value. You can then use this in yet
another call to setlocale(). (In some implementations, the return
value may sometimes differ from the value you gave as the second
argument--think of it as an alias for the value you gave.)

As the example shows, if the second argument is an empty string, the
category's locale is returned to the default specified by the
corresponding environment variables. Generally, this results in a
return to the default that was in force when Perl started up: changes
to the environment made by the application after startup may or may not
be noticed, depending on your system's C library.

If the second argument does not correspond to a valid locale, the locale
for the category is not changed, and the function returns I<undef>.

For further information about the categories, consult L<setlocale(3)>.

=head2 Finding locales

For locales available in your system, consult also L<setlocale(3)> to
see whether it leads to the list of available locales (search for the
I<SEE ALSO> section).
If that fails, try the following command lines:

    locale -a

    nlsinfo

    ls /usr/lib/nls/loc

    ls /usr/lib/locale

    ls /usr/lib/nls

    ls /usr/share/locale

and see whether they list something resembling these:

    en_US.ISO8859-1     de_DE.ISO8859-1     ru_RU.ISO8859-5
    en_US.iso88591      de_DE.iso88591      ru_RU.iso88595
    en_US               de_DE               ru_RU
    en                  de                  ru
    english             german              russian
    english.iso88591    german.iso88591     russian.iso88595
    english.roman8                          russian.koi8r

Sadly, even though the calling interface for setlocale() has been
standardized, names of locales and the directories where the
configuration resides have not been. The basic form of the name is
I<language_territory>B<.>I<codeset>, but the latter parts after
I<language> are not always present. The I<language> and I<country>
are usually from the standards B<ISO 639> and B<ISO 3166>, the
two-letter abbreviations for the languages and the countries of the
world, respectively. The I<codeset> part often mentions some B<ISO
8859> character set, the Latin codesets. For example, C<ISO 8859-1>
is the so-called "Western European codeset" that can be used to encode
most Western European languages adequately. Again, there are several
ways to write even the name of that one standard. Lamentably.

Two special locales are worth particular mention: "C" and "POSIX".
Currently these are effectively the same locale: the difference is
mainly that the first one is defined by the C standard, the second by
the POSIX standard. They define the B<default locale> in which
every program starts in the absence of locale information in its
environment. (The I<default> default locale, if you will.) Its language
is (American) English and its character codeset ASCII.

B<NOTE>: Not all systems have the "POSIX" locale (not all systems are
POSIX-conformant), so use "C" when you need to specify this
default locale explicitly.

=head2 LOCALE PROBLEMS

You may encounter the following warning message at Perl startup:

    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
            LC_ALL = "En_US",
            LANG = (unset)
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").

This means that your locale settings had LC_ALL set to "En_US" and
LANG exists but has no value. Perl tried to believe you but could not.
Instead, Perl gave up and fell back to the "C" locale, the default locale
that is supposed to work no matter what. This usually means your locale
settings were wrong, they mention locales your system has never heard
of, or the locale installation in your system has problems (for example,
some system files are broken or missing). There are quick and temporary
fixes to these problems, as well as more thorough and lasting fixes.

=head2 Temporarily fixing locale problems

The two quickest fixes are either to render Perl silent about any
locale inconsistencies or to run Perl under the default locale "C".

Perl's moaning about locale problems can be silenced by setting the
environment variable PERL_BADLANG to a zero value, for example "0".
This method really just sweeps the problem under the carpet: you tell
Perl to shut up even when Perl sees that something is wrong. Do not
be surprised if later something locale-dependent misbehaves.

Perl can be run under the "C" locale by setting the environment
variable LC_ALL to "C". This method is perhaps a bit more civilized
than the PERL_BADLANG approach, but setting LC_ALL (or
other locale variables) may affect other programs as well, not just
Perl. In particular, external programs run from within Perl will see
these changes. If you make the new settings permanent (read on), all
programs you run see the changes. See L<ENVIRONMENT> for
the full list of relevant environment variables and L<USING LOCALES>
for their effects in Perl. Effects in other programs are
easily deducible.
For example, the variable LC_COLLATE may well affect
your B<sort> program (or whatever the program that arranges "records"
alphabetically in your system is called).

You can test out changing these variables temporarily, and if the
new settings seem to help, put those settings into your shell startup
files. Consult your local documentation for the exact details. In
Bourne-like shells (B<sh>, B<ksh>, B<bash>, B<zsh>):

    LC_ALL=en_US.ISO8859-1
    export LC_ALL

This assumes that we saw the locale "en_US.ISO8859-1" using the commands
discussed above and decided to try it instead of the faulty
locale "En_US" shown earlier. In csh-like shells (B<csh>, B<tcsh>):

    setenv LC_ALL en_US.ISO8859-1

If you do not know what shell you have, consult your local
helpdesk or the equivalent.

=head2 Permanently fixing locale problems

The slower but superior fix is to correct the misconfiguration itself.
You may be able to fix the misconfiguration of your own environment
variables yourself. The mis(sing)configuration of the whole system's
locales usually requires the help of your friendly system administrator.

First, see L<Finding locales> earlier in this document, which tells
how to find which locales are really supported--and more importantly,
installed--on your system. In our example error message, environment
variables affecting the locale are listed in the order of decreasing
importance (and unset variables do not matter). Therefore, having