require 5;
package I18N::LangTags::List;
# Time-stamp: "2002-02-02 20:13:58 MST"
use strict;
use vars qw(%Name $Debug $VERSION);
$VERSION = '0.25';
# POD at the end.

#----------------------------------------------------------------------
{
# read the table out of our own POD!
  my $seeking = 1;
  my $count = 0;
  my($tag,$name);
  while(<I18N::LangTags::List::DATA>) {
    if($seeking) {
      $seeking = 0 if m/=for woohah/;
    } else {
      next unless ($tag, $name) =
        m/\{([-0-9a-zA-Z]+)\}(?:\s*:)?\s*([^\[\]]+)/;
      $name =~ s/\s*[;\.]*\s*$//g;
      next unless $name;
      ++$count;
      print "<$tag> <$name>\n" if $Debug;
      $Name{$tag} = $name;
    }
  }
  die "No tags read??" unless $count;
}
#----------------------------------------------------------------------

sub name {
  my $tag = lc($_[0] || return);
  $tag =~ s/^\s+//s;
  $tag =~ s/\s+$//s;

  my $alt;
  if($tag =~ m/^x-(.+)/) {
    $alt = "i-$1";
  } elsif($tag =~ m/^i-(.+)/) {
    $alt = "x-$1";
  } else {
    $alt = '';
  }

  my $subform = '';
  my $name = '';
  print "Input: {$tag}\n" if $Debug;
  while(length $tag) {
    last if $name = $Name{$tag};
    last if $name = $Name{$alt};
    if($tag =~ s/(-[a-z0-9]+)$//s) {
      print "Shaving off: $1 leaving $tag\n" if $Debug;
      $subform = "$1$subform";
       # and loop around again

      $alt =~ s/(-[a-z0-9]+)$//s && $Debug && print " alt -> $alt\n";
    } else {
      # we're trying to pull a subform off a primary tag. TILT!
      print "Aborting on: {$name}{$subform}\n" if $Debug;
      last;
    }
  }
  print "Output: {$name}{$subform}\n" if $Debug;

  return unless $name;   # Failure
  return $name unless $subform;   # Exact match
  $subform =~ s/^-//s;
  $subform =~ s/-$//s;
  return "$name (Subform \"$subform\")";
}

1;

__DATA__

=head1 NAME

I18N::LangTags::List -- tags and names for human languages

=head1 SYNOPSIS

  use I18N::LangTags::List;
  print "Parlez-vous... ",
    join(', ',
        I18N::LangTags::List::name('elx') || 'unknown_language',
        I18N::LangTags::List::name('ar-Kw') || 'unknown_language',
        I18N::LangTags::List::name('en') || 'unknown_language',
        I18N::LangTags::List::name('en-CA') || 'unknown_language',
    ),
    "?\n";

prints:

  Parlez-vous... Elamite, Kuwait Arabic, English, Canadian English?

=head1 DESCRIPTION

This module provides a function
C<I18N::LangTags::List::name( I<langtag> ) > that takes
a language tag (see L<I18N::LangTags|I18N::LangTags>)
and returns the best attempt at an English name for it, or
undef if it can't make sense of the tag.

The function I18N::LangTags::List::name(...) is not exported.

The map of tags-to-names that it uses is accessible as
%I18N::LangTags::List::Name, and it's the same as the list
that follows in this documentation, which should be useful
to you even if you don't use this module.

=head1 ABOUT LANGUAGE TAGS

Internet language tags, as defined in RFC 3066, are a formalism
for denoting human languages. The two-letter ISO 639-1 language
codes are well known (as "en" for English), as are their forms
when qualified by a country code ("en-US"). Less well-known are the
arbitrary-length non-ISO codes (like "i-mingo"), and the
recently (in 2001) introduced three-letter ISO-639-2 codes.

Remember these important facts:

=over

=item *

Language tags are not locale IDs. A locale ID is written with a "_"
instead of a "-", (almost?) always matches C<m/^\w\w_\w\w\b/>, and
I<means> something different than a language tag. A language tag
denotes a language. A locale ID denotes a language I<as used in>
a particular place, in combination with non-linguistic
location-specific information such as what currency is used
there.
Locales I<also> often denote character set information,
as in "en_US.ISO8859-1".

=item *

Language tags are not for computer languages.

=item *

"Dialect" is not a useful term, since there is no objective
criterion for establishing when two language-forms are
dialects of each other, or are separate languages.

=item *

Language tags are not case-sensitive. en-US, en-us, En-Us, etc.,
are all the same tag, and denote the same language.

=item *

Not every language tag really refers to a single language. Some
language tags refer to conditions: i-default (system-message text
in English plus maybe other languages), und (undetermined
language). Others (notably lots of the three-letter codes) are
bibliographic tags that classify whole groups of languages, as
with cus "Cushitic (Other)" (i.e., a
language that has been classed as Cushitic, but which has no more
specific code) or the even less linguistically coherent
sai for "South American Indian (Other)". Though useful in
bibliography, B<SUCH TAGS ARE NOT
FOR GENERAL USE>. For further guidance, email me.

=item *

Language tags are not country codes. In fact, they are often
distinct codes, as with language tag ja for Japanese, and
ISO 3166 country code C<.jp> for Japan.

=back

=head1 LIST OF LANGUAGES

The first part of each item is the language tag, between
{...}.
It is followed by an English name for the language or language-group.
Language tags that I judge to be not for general use are bracketed.

This list is in alphabetical order by English name of the language.

=for reminder
 The name in the =item line MUST NOT have E<...>'s in it!!

=for woohah START

=over

=item {ab} : Abkhazian

eq Abkhaz

=item {ace} : Achinese

=item {ach} : Acoli

=item {ada} : Adangme

=item {aa} : Afar

=item {afh} : Afrihili

(Artificial)

=item {af} : Afrikaans

=item [{afa} : Afro-Asiatic (Other)]

=item {aka} : Akan

=item {akk} : Akkadian

(Historical)

=item {sq} : Albanian

=item {ale} : Aleut

=item [{alg} : Algonquian languages]

NOT Algonquin!

=item [{tut} : Altaic (Other)]

=item {am} : Amharic

NOT Aramaic!

=item {i-ami} : Ami

eq Amis. eq 'Amis. eq Pangca.

=item [{apa} : Apache languages]

=item {ar} : Arabic

Many forms are mutually un-intelligible in spoken media.
Notable forms:
{ar-ae} UAE Arabic;
{ar-bh} Bahrain Arabic;
{ar-dz} Algerian Arabic;
{ar-eg} Egyptian Arabic;
{ar-iq} Iraqi Arabic;
{ar-jo} Jordanian Arabic;
{ar-kw} Kuwait Arabic;
{ar-lb} Lebanese Arabic;
{ar-ly} Libyan Arabic;
{ar-ma} Moroccan Arabic;
{ar-om} Omani Arabic;
{ar-qa} Qatari Arabic;
{ar-sa} Saudi Arabic;
{ar-sy} Syrian Arabic;
{ar-tn} Tunisian Arabic;
{ar-ye} Yemen Arabic.

=item {arc} : Aramaic

NOT Amharic! NOT Samaritan Aramaic!

=item {arp} : Arapaho

=item {arn} : Araucanian

=item {arw} : Arawak

=item {hy} : Armenian

=item [{art} : Artificial (Other)]

=item {as} : Assamese

=item [{ath} : Athapascan languages]

eq Athabaskan. eq Athapaskan.
eq Athabascan.

=item [{aus} : Australian languages]

=item [{map} : Austronesian (Other)]

=item {ava} : Avaric

=item {ae} : Avestan

eq Zend

=item {awa} : Awadhi

=item {ay} : Aymara

=item {az} : Azerbaijani

eq Azeri

=item {ban} : Balinese

=item [{bat} : Baltic (Other)]

=item {bal} : Baluchi

=item {bam} : Bambara

=item [{bai} : Bamileke languages]

=item {bad} : Banda

=item [{bnt} : Bantu (Other)]

=item {bas} : Basa

=item {ba} : Bashkir

=item {eu} : Basque

=item {btk} : Batak (Indonesia)

=item {bej} : Beja

=item {be} : Belarusian

eq Belarussian. eq Byelarussian.
eq Belorussian. eq Byelorussian.
eq White Russian. eq White Ruthenian.
NOT Ruthenian!

=item {bem} : Bemba

=item {bn} : Bengali

eq Bangla.

=item [{ber} : Berber (Other)]

=item {bho} : Bhojpuri

=item {bh} : Bihari

=item {bik} : Bikol

=item {bin} : Bini

=item {bi} : Bislama

eq Bichelamar.

=item {bs} : Bosnian

=item {bra} : Braj

=item {br} : Breton

=item {bug} : Buginese

=item {bg} : Bulgarian

=item {i-bnn} : Bunun

=item {bua} : Buriat

=item {my} : Burmese

=item {cad} : Caddo

=item {car} : Carib

=item {ca} : Catalan

eq CatalE<aacute>n. eq Catalonian.

=item [{cau} : Caucasian (Other)]

=item {ceb} : Cebuano

=item [{cel} : Celtic (Other)]

Notable forms:
{cel-gaulish} Gaulish (Historical)

=item [{cai} : Central American Indian (Other)]

=item {chg} : Chagatai

(Historical?)

=item [{cmc} : Chamic languages]

=item {ch} : Chamorro

=item {ce} : Chechen

=item {chr} : Cherokee

eq Tsalagi

=item {chy} : Cheyenne

=item {chb} : Chibcha

(Historical) NOT Chibchan (which is a language family).

=item {ny} : Chichewa

eq Nyanja.
eq Chinyanja.

=item {zh} : Chinese

Many forms are mutually un-intelligible in spoken media.
Notable subforms:
{zh-cn} PRC Chinese;
{zh-hk} Hong Kong Chinese;
{zh-mo} Macau Chinese;
{zh-sg} Singapore Chinese;
{zh-tw} Taiwan Chinese;
{zh-guoyu} Mandarin [Putonghua/Guoyu];
{zh-hakka} Hakka [formerly i-hakka];
{zh-min} Hokkien;
{zh-min-nan} Southern Hokkien;
{zh-wuu} Shanghaiese;
{zh-xiang} Hunanese;
{zh-gan} Gan;
{zh-yue} Cantonese.

=for etc
{i-hakka} Hakka (old tag)

=item {chn} : Chinook Jargon

eq Chinook Wawa.

=item {chp} : Chipewyan

=item {cho} : Choctaw

=item {cu} : Church Slavic

eq Old Church Slavonic.

=item {chk} : Chuukese

eq Trukese. eq Chuuk. eq Truk. eq Ruk.

=item {cv} : Chuvash

=item {cop} : Coptic

=item {kw} : Cornish

=item {co} : Corsican

eq Corse.

=item {cre} : Cree

NOT Creek!

=item {mus} : Creek

NOT Cree!

=item [{cpe} : English-based Creoles and pidgins (Other)]

=item [{cpf} : French-based Creoles and pidgins (Other)]

=item [{cpp} : Portuguese-based Creoles and pidgins (Other)]

=item [{crp} : Creoles and pidgins (Other)]

=item {hr} : Croatian

eq Croat.

=item [{cus} : Cushitic (Other)]

=item {cs} : Czech

=item {dak} : Dakota

eq Nakota. eq Lakota.

=item {da} : Danish

=item {day} : Dayak

=item {i-default} : Default (Fallthru) Language

Defined in RFC 2277, this is for tagging text
(which must include English text, and might/should include text
in other appropriate languages) that is emitted in a context
where language-negotiation wasn't possible -- in SMTP mail failure
messages, for example.

=item {del} : Delaware

=item {din} : Dinka

=item {div} : Divehi

=item {doi} : Dogri

NOT Dogrib!

=item {dgr} : Dogrib

NOT Dogri!

=item [{dra} : Dravidian (Other)]

=item {dua} : Duala

=item {nl} : Dutch

eq Netherlander.
Notable forms:
{nl-nl} Netherlands Dutch;
{nl-be} Belgian Dutch.

=item {dum} : Middle Dutch (ca.1050-1350)

(Historical)

=item {dyu} : Dyula

=item {dz} : Dzongkha

=item {efi} : Efik

=item {egy} : Ancient Egyptian

(Historical)

=item {eka} : Ekajuk

=item {elx} : Elamite

(Historical)

=item {en} : English

Notable forms:
{en-au} Australian English;
{en-bz} Belize English;
{en-ca} Canadian English;
{en-gb} UK English;
{en-ie} Irish English;
{en-jm} Jamaican English;
{en-nz} New Zealand English;
{en-ph} Philippine English;
{en-tt} Trinidad English;
{en-us} US English;
{en-za} South African English;
{en-zw} Zimbabwe English.

=item {enm} : Middle English (1100-1500)

(Historical)

=item {ang} : Old English (ca.450-1100)

eq Anglo-Saxon. (Historical)

=item {eo} : Esperanto

(Artificial)

=item {et} : Estonian

=item {ewe} : Ewe

=item {ewo} : Ewondo

=item {fan} : Fang

=item {fat} : Fanti

=item {fo} : Faroese

=item {fj} : Fijian

=item {fi} : Finnish

=item [{fiu} : Finno-Ugrian (Other)]

eq Finno-Ugric. NOT Ugaritic!

=item {fon} : Fon

=item {fr} : French

Notable forms:
{fr-fr} France French;
{fr-be} Belgian French;
{fr-ca} Canadian French;
{fr-ch} Swiss French;
{fr-lu} Luxembourg French;
{fr-mc} Monaco French.

=item {frm} : Middle French (ca.1400-1600)

(Historical)

=item {fro} : Old French (842-ca.1400)

(Historical)

=item {fy} : Frisian

=item {fur} : Friulian

=item {ful} : Fulah

=item {gaa} : Ga

=item {gd} : Scots Gaelic

NOT Scots!

=item {gl} : Gallegan

eq Galician

=item {lug} : Ganda

=item {gay} : Gayo

=item {gba} : Gbaya

=item {gez} : Geez

eq Ge'ez

=item {ka} : Georgian

=item {de} : German

Notable forms:
{de-at} Austrian German;
{de-be} Belgian German;
{de-ch} Swiss German;
{de-de} Germany German;
{de-li} Liechtenstein German;
{de-lu} Luxembourg German.

=item {gmh} : Middle High German (ca.1050-1500)

(Historical)

=item {goh} : Old High German (ca.750-1050)

(Historical)

=item [{gem} : Germanic (Other)]

=item {gil} : Gilbertese

=item {gon} : Gondi

=item {gor} : Gorontalo

=item {got} : Gothic

(Historical)

=item {grb} : Grebo

=item {grc} : Ancient Greek

(Historical) (Until 15th
century or so.)

=item {el} : Modern Greek

(Since 15th century or so.)

=item {gn} : Guarani

GuaranE<iacute>

=item {gu} : Gujarati

=item {gwi} : Gwich'in

eq Gwichin

=item {hai} : Haida

=item {ha} : Hausa

=item {haw} : Hawaiian

Hawai'ian

=item {he} : Hebrew

(Formerly "iw".)

=for etc
{iw} Hebrew (old tag)

=item {hz} : Herero

=item {hil} : Hiligaynon

=item {him} : Himachali

=item {hi} : Hindi

=item {ho} : Hiri Motu

=item {hit} : Hittite

(Historical)

=item {hmn} : Hmong

=item {hu} : Hungarian

=item {hup} : Hupa

=item {iba} : Iban

=item {is} : Icelandic

=item {ibo} : Igbo

=item {ijo} : Ijo

=item {ilo} : Iloko

=item [{inc} : Indic (Other)]

=item [{ine} : Indo-European (Other)]

=item {id} : Indonesian

(Formerly "in".)

=for etc
{in} Indonesian (old tag)

=item {ia} : Interlingua (International Auxiliary Language Association)

(Artificial) NOT Interlingue!

=item {ie} : Interlingue

(Artificial) NOT Interlingua!

=item {iu} : Inuktitut

A subform of "Eskimo".

=item {ik} : Inupiaq

A subform of "Eskimo".

=item [{ira} : Iranian (Other)]

=item {ga} : Irish

=item {mga} : Middle Irish (900-1200)

(Historical)

=item {sga} : Old Irish (to 900)

(Historical)

=item [{iro} : Iroquoian languages]

=item {it} : Italian

Notable forms:
{it-it} Italy Italian;
{it-ch} Swiss Italian.

=item {ja} : Japanese

(NOT "jp"!)

=item {jw} : Javanese

=item {jrb} : Judeo-Arabic

=item {jpr} : Judeo-Persian

=item {kab} : Kabyle

=item {kac} : Kachin

=item {kl} : Kalaallisut

eq Greenlandic "Eskimo"

=item {kam} : Kamba

=item {kn} : Kannada

eq Kanarese. NOT Canadian!

=item {kau} : Kanuri

=item {kaa} : Kara-Kalpak

=item {kar} : Karen

=item {ks} : Kashmiri

=item {kaw} : Kawi

=item {kk} : Kazakh

=item {kha} : Khasi

=item {km} : Khmer

eq Cambodian. eq Kampuchean.