    <I WITH CIRCUMFLEX>          206      118      118      118      195.142  138.85
    <I WITH DIAERESIS>           207      119      119      119      195.143  138.86
    <CAPITAL LETTER ETH>         208      172      172      172      195.144  138.87
    <N WITH TILDE>               209      105      105      105      195.145  138.88
    <O WITH GRAVE>               210      237      237      237      195.146  138.89
    <O WITH ACUTE>               211      238      238      238      195.147  138.98
    <O WITH CIRCUMFLEX>          212      235      235      235      195.148  138.99
    <O WITH TILDE>               213      239      239      239      195.149  138.100
    <O WITH DIAERESIS>           214      236      236      236      195.150  138.101
    <MULTIPLICATION SIGN>        215      191      191      191      195.151  138.102
    <O WITH STROKE>              216      128      128      128      195.152  138.103
    <U WITH GRAVE>               217      253      253      224      195.153  138.104 ###
    <U WITH ACUTE>               218      254      254      254      195.154  138.105
    <U WITH CIRCUMFLEX>          219      251      251      221      195.155  138.106 ###
    <U WITH DIAERESIS>           220      252      252      252      195.156  138.112
    <Y WITH ACUTE>               221      173      186      173      195.157  138.113 *** ###
    <CAPITAL LETTER THORN>       222      174      174      174      195.158  138.114
    <SMALL LETTER SHARP S>       223      89       89       89       195.159  138.115
    <a WITH GRAVE>               224      68       68       68       195.160  139.65
    <a WITH ACUTE>               225      69       69       69       195.161  139.66
    <a WITH CIRCUMFLEX>          226      66       66       66       195.162  139.67
    <a WITH TILDE>               227      70       70       70       195.163  139.68
    <a WITH DIAERESIS>           228      67       67       67       195.164  139.69
    <a WITH RING ABOVE>          229      71       71       71       195.165  139.70
    <SMALL LIGATURE ae>          230      156      156      156      195.166  139.71
    <c WITH CEDILLA>             231      72       72       72       195.167  139.72
    <e WITH GRAVE>               232      84       84       84       195.168  139.73
    <e WITH ACUTE>               233      81       81       81       195.169  139.74
    <e WITH CIRCUMFLEX>          234      82       82       82       195.170  139.81
    <e WITH DIAERESIS>           235      83       83       83       195.171  139.82
    <i WITH GRAVE>               236      88       88       88       195.172  139.83
    <i WITH ACUTE>               237      85       85       85       195.173  139.84
    <i WITH CIRCUMFLEX>          238      86       86       86       195.174  139.85
    <i WITH DIAERESIS>           239      87       87       87       195.175  139.86
    <SMALL LETTER eth>           240      140      140      140      195.176  139.87
    <n WITH TILDE>               241      73       73       73       195.177  139.88
    <o WITH GRAVE>               242      205      205      205      195.178  139.89
    <o WITH ACUTE>               243      206      206      206      195.179  139.98
    <o WITH CIRCUMFLEX>          244      203      203      203      195.180  139.99
    <o WITH TILDE>               245      207      207      207      195.181  139.100
    <o WITH DIAERESIS>           246      204      204      204      195.182  139.101
    <DIVISION SIGN>              247      225      225      225      195.183  139.102
    <o WITH STROKE>              248      112      112      112      195.184  139.103
    <u WITH GRAVE>               249      221      221      192      195.185  139.104 ###
    <u WITH ACUTE>               250      222      222      222      195.186  139.105
    <u WITH CIRCUMFLEX>          251      219      219      219      195.187  139.106
    <u WITH DIAERESIS>           252      220      220      220      195.188  139.112
    <y WITH ACUTE>               253      141      141      141      195.189  139.113
    <SMALL LETTER thorn>         254      142      142      142      195.190  139.114
    <y WITH DIAERESIS>           255      223      223      223      195.191  139.115

If you would rather see the above table in CCSID 0037 order rather than
ASCII + Latin-1 order then run the table through:

=over 4

=item recipe 4

=back

    perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
     -e '{push(@l,$_)}' \
     -e 'END{print map{$_->[0]}' \
     -e '          sort{$a->[1] <=> $b->[1]}' \
     -e '          map{[$_,substr($_,42,3)]}@l;}' perlebcdic.pod

If you would rather see it in CCSID 1047 order then change the number
42 in the last line to 51, like this:

=over 4

=item recipe 5

=back

    perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
     -e '{push(@l,$_)}' \
     -e 'END{print map{$_->[0]}' \
     -e '          sort{$a->[1] <=> $b->[1]}' \
     -e '          map{[$_,substr($_,51,3)]}@l;}' perlebcdic.pod

If you would rather see it in POSIX-BC order then change the number
51 in the last line to 60, like this:

=over 4

=item recipe 6

=back

    perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
     -e '{push(@l,$_)}' \
     -e 'END{print map{$_->[0]}' \
     -e '          sort{$a->[1] <=> $b->[1]}' \
     -e '          map{[$_,substr($_,60,3)]}@l;}' perlebcdic.pod

=head1 IDENTIFYING CHARACTER CODE SETS

To determine the character set you are running under from perl, one
could use the return value of ord() or chr() to test one or more
character values.  For example:

    $is_ascii  = "A" eq chr(65);
    $is_ebcdic = "A" eq chr(193);

Also, "\t" is a C<HORIZONTAL TABULATION> character so that:

    $is_ascii  = ord("\t") == 9;
    $is_ebcdic = ord("\t") == 5;

To distinguish EBCDIC code pages try looking at one or more of
the characters that differ between them.
For example:

    $is_ebcdic_37   = "\n" eq chr(37);
    $is_ebcdic_1047 = "\n" eq chr(21);

Or better still choose a character that is encoded differently in each
of the code sets, e.g.:

    $is_ascii           = ord('[') == 91;
    $is_ebcdic_37       = ord('[') == 186;
    $is_ebcdic_1047     = ord('[') == 173;
    $is_ebcdic_POSIX_BC = ord('[') == 187;

However, it would be unwise to write tests such as:

    $is_ascii = "\r" ne chr(13);  #  WRONG
    $is_ascii = "\n" ne chr(10);  #  ILL ADVISED

Obviously the first of these will fail to distinguish most ASCII machines
from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC machine since
"\r" eq chr(13) under all of those coded character sets.  But note too
that because "\n" is chr(13) and "\r" is chr(10) on the Macintosh (which
is an ASCII machine) the second C<$is_ascii> test will lead to trouble
there.

To determine whether or not perl was built under an EBCDIC code page
you can use the Config module like so:

    use Config;
    $is_ebcdic = $Config{'ebcdic'} eq 'define';

=head1 CONVERSIONS

=head2 tr///

In order to convert a string of characters from one character set to
another, a simple list of numbers, such as in the right columns in the
above table, along with perl's tr/// operator is all that is needed.
The data in the table are in ASCII order, hence the EBCDIC columns
provide easy to use ASCII to EBCDIC operations that are also easily
reversed.

For example, to convert ASCII to code page 037 take the output of the
second column from the output of recipe 0 (modified to add \\ characters)
and use it in tr/// like so:

    $cp_037 =
    '\000\001\002\003\234\011\206\177\227\215\216\013\014\015\016\017' .
    '\020\021\022\023\235\205\010\207\030\031\222\217\034\035\036\037' .
    '\200\201\202\203\204\012\027\033\210\211\212\213\214\005\006\007' .
    '\220\221\026\223\224\225\226\004\230\231\232\233\024\025\236\032' .
    '\040\240\342\344\340\341\343\345\347\361\242\056\074\050\053\174' .
    '\046\351\352\353\350\355\356\357\354\337\041\044\052\051\073\254' .
    '\055\057\302\304\300\301\303\305\307\321\246\054\045\137\076\077' .
    '\370\311\312\313\310\315\316\317\314\140\072\043\100\047\075\042' .
    '\330\141\142\143\144\145\146\147\150\151\253\273\360\375\376\261' .
    '\260\152\153\154\155\156\157\160\161\162\252\272\346\270\306\244' .
    '\265\176\163\164\165\166\167\170\171\172\241\277\320\335\336\256' .
    '\136\243\245\267\251\247\266\274\275\276\133\135\257\250\264\327' .
    '\173\101\102\103\104\105\106\107\110\111\255\364\366\362\363\365' .
    '\175\112\113\114\115\116\117\120\121\122\271\373\374\371\372\377' .
    '\134\367\123\124\125\126\127\130\131\132\262\324\326\322\323\325' .
    '\060\061\062\063\064\065\066\067\070\071\263\333\334\331\332\237' ;

    my $ebcdic_string = $ascii_string;
    eval '$ebcdic_string =~ tr/' . $cp_037 . '/\000-\377/';

To convert from EBCDIC 037 to ASCII just reverse the order of the tr///
arguments like so:

    my $ascii_string = $ebcdic_string;
    eval '$ascii_string =~ tr/\000-\377/' . $cp_037 . '/';

Similarly one could take the output of the third column from recipe 0 to
obtain a C<$cp_1047> table.  The fourth column of the output from recipe
0 could provide a C<$cp_posix_bc> table suitable for transcoding as well.

=head2 iconv

XPG operability often implies the presence of an I<iconv> utility
available from the shell or from the C library.  Consult your system's
documentation for information on iconv.

On OS/390 or z/OS see the iconv(1) manpage.  One way to invoke the iconv
shell utility from within perl would be to:

    # OS/390 or z/OS example
    $ascii_data = `echo '$ebcdic_data'| iconv -f IBM-1047 -t ISO8859-1`;

or the inverse map:

    # OS/390 or z/OS example
    $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`;

For other perl based conversion options see the Convert::* modules on CPAN.

=head2 C RTL

The OS/390 and z/OS C run time libraries provide _atoe() and _etoa() functions.

=head1 OPERATOR DIFFERENCES

The C<..> range operator treats certain character ranges with care on
EBCDIC machines.
For example the following array
will have twenty-six elements on either an EBCDIC machine
or an ASCII machine:

    @alphabet = ('A'..'Z');   #  $#alphabet == 25

The bitwise operators such as & ^ | may return different results
when operating on string or character data in a perl program running
on an EBCDIC machine than when run on an ASCII machine.  Here is
an example adapted from the one in L<perlop>:

    # EBCDIC-based examples
    print "j p \n" ^ " a h";                      # prints "JAPH\n"
    print "JA" | "  ph\n";                        # prints "japh\n"
    print "JAPH\nJunk" & "\277\277\277\277\277";  # prints "japh\n";
    print 'p N$' ^ " E<H\n";                      # prints "Perl\n";

An interesting property of the 32 C0 control characters
in the ASCII table is that they can "literally" be constructed
as control characters in perl, e.g. C<(chr(0) eq "\c@")>,
C<(chr(1) eq "\cA")>, and so on.  Perl on EBCDIC machines has been
ported to take "\c@" to chr(0) and "\cA" to chr(1) as well, but the
thirty three characters that result depend on which code page you are
using.  The table below uses the character names from the previous table
but with substitutions such as s/START OF/S.O./; s/END OF /E.O./;
s/TRANSMISSION/TRANS./; s/TABULATION/TAB./; s/VERTICAL/VERT./;
s/HORIZONTAL/HORIZ./; s/DEVICE CONTROL/D.C./; s/SEPARATOR/SEP./;
s/NEGATIVE ACKNOWLEDGE/NEG. ACK./;.  The POSIX-BC and 1047 sets are
identical throughout this range and differ from the 0037 set at only
one spot (21 decimal).  Note that the C<LINE FEED> character
may be generated by "\cJ" on ASCII machines but by "\cU" on 1047 or POSIX-BC