<HTML>
<HEAD>
<TITLE>Linux Complete Command Reference:File Formats:EarthWeb Inc.-</TITLE>
</HEAD>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<SCRIPT>
<!--
function displayWindow(url, width, height) {
var Win = window.open(url,"displayWindow",'width=' + width +
',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>
<!-- ISBN=0672311046 //-->
<!-- TITLE=Linux Complete Command Reference//-->
<!-- AUTHOR=Red Hat//-->
<!-- PUBLISHER=Macmillan Computer Publishing//-->
<!-- IMPRINT=Sams//-->
<!-- CHAPTER=05 //-->
<!-- PAGES=1103-1208 //-->
<!-- UNASSIGNED1 //-->
<!-- UNASSIGNED2 //-->
<P><CENTER>
<a href="1155-1155.html">Previous</A> | <a href="../ewtoc.html">Table of Contents</A> | <a href="1158-1159.html">Next</A></CENTER></P>
<A NAME="PAGENUM-1156"><P>Page 1156</P></A>
<P><B>
INTRODUCTION
</B></P>
<P>DOS uses a different character code mapping from UNIX. Seven-bit characters still have the same meaning; only
characters with the eighth bit set are affected. To make matters worse, there are several translation tables available, depending on
the country where you are. The appearance of the characters is defined using code pages. These code pages aren't the same for
all countries. For instance, some code pages don't contain uppercase accented characters. On the other hand, some code
pages contain characters that don't exist in UNIX, such as certain line-drawing characters or accented consonants used by
some Eastern European countries. This affects two things relating to filenames:
</P>
<TABLE>
<TR><TD>
Uppercase characters
</TD><TD>
In short names, only uppercase characters are allowed. This also holds for
accented characters. For instance, in a code page that doesn't contain accented uppercase
characters, the accented lowercase characters get transformed into their unaccented counterparts.
</TD></TR><TR><TD>
Long filenames
</TD><TD>
Microsoft has finally come to their senses and uses a more standard mapping for the
long filenames. They use Unicode, stored as 16-bit characters. Its first
256 characters are identical to ISO 8859-1 (Latin-1), the 8-bit superset of ASCII commonly used on UNIX. Thus, the code page also affects the
correspondence between the codes used in long names and those used in short names.
</TD></TR></TABLE>
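<P>The difference between the mappings can be illustrated with a short Python sketch that uses Python's built-in codecs. It assumes that the UNIX mapping is ISO 8859-1 (Latin-1), which agrees with the example tables in this section, and it uses code page 437 only as one example of a code page that lacks some accented uppercase characters. The helper has_char is a name invented for this illustration and is not part of mtools.
</P>
<!-- CODE //-->
<PRE>
```python
# Assumption: "UNIX mapping" is taken to mean ISO 8859-1 (Latin-1).
U_DIAERESIS = "\u00fc"      # lowercase u with two dots
A_GRAVE_UPPER = "\u00c0"    # uppercase A with grave accent
A_GRAVE_LOWER = "\u00e0"    # lowercase a with grave accent


def has_char(ch, codepage):
    """Return True if the code page can represent the character."""
    try:
        ch.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


# Seven-bit characters have the same code in both mappings.
same_ascii = "A".encode("cp850") == "A".encode("latin-1")

# Characters with the eighth bit set differ between DOS and UNIX.
dos_byte = U_DIAERESIS.encode("cp850")
unix_byte = U_DIAERESIS.encode("latin-1")
print(same_ascii, dos_byte.hex(), unix_byte.hex())

# Code page 437 has the lowercase letter but not its uppercase form;
# code page 850 has both.
print(has_char(A_GRAVE_LOWER, "cp437"), has_char(A_GRAVE_UPPER, "cp437"))
print(has_char(A_GRAVE_UPPER, "cp850"))
```
</PRE>
<!-- END CODE //-->
<P>In a code page such as 437, a short name containing the lowercase accented letter has no accented uppercase form to fall back on, which is the situation described in the first table row.
</P>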
<P>mtools considers the filenames entered on the command line as having the UNIX mapping and translates the characters
to get short names. By default, code page 850 is used with the Swiss uppercase/lowercase mapping. I chose this code
page because its set of existing characters most closely matches UNIX's. Moreover, this code page covers most characters in use
in the USA, Australia, and Western Europe. However, it is still possible to choose a different mapping. There are two
methods: the country variable and explicit tables.
</P>
<P><B>
CONFIGURATION USING COUNTRY
</B></P>
<P>The COUNTRY variable is recommended for people who also have access to MS-DOS system files and documentation. If
you don't have access to these, I'd suggest you use explicit tables instead.
</P>
<P>
Syntax: COUNTRY="country[,[codepage],country.sys]"
</P>
<P>This tells mtools to use a UNIX-to-DOS translation table that matches
codepage and a lowercase-to-uppercase table for
country, and to read that lowercase-to-uppercase table from the country.sys file. The country code is most often the
telephone prefix of the country. Refer to the DOS help page on
country for more details. The codepage and the
country.sys parameters are optional. Don't type in the square brackets; they are only there to indicate which parameters are optional.
The country.sys file is supplied with MS-DOS. In most cases, you don't need it because the most common translation tables
are compiled into mtools. Don't worry if you run a UNIX-only box that lacks this file.
</P>
<P>If codepage is not given, a per-country default code page is used. If the
country.sys parameter isn't given, compiled-in defaults are used for the lowercase-to-uppercase table. This is useful on UNIX systems other than Linux, which may have
no country.sys file available online.
</P>
<P>The UNIX-to-DOS tables are not contained in the
country.sys file, and thus mtools always uses compiled-in defaults for
those. Thus, only a limited number of code pages are supported. If your preferred code page is missing, or if you know the name
of the Windows 95 file that contains this mapping, drop me a line at
Alain.Knaff@inrialpes.fr.
</P>
<P>The COUNTRY variable can also be set using the environment.
</P>
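<P>As an illustration of the COUNTRY syntax, the following Python sketch splits a value into its three fields. It is not how mtools itself parses the variable. The function name parse_country, the lenient handling of missing fields, and the path /dosc/country.sys are assumptions made for this example.
</P>
<!-- CODE //-->
<PRE>
```python
def parse_country(value):
    """Split a COUNTRY value into its three fields.

    Assumption: empty or missing optional fields are returned as None.
    """
    fields = [part.strip() for part in value.split(",", 2)]
    fields += [""] * (3 - len(fields))
    country, codepage, country_sys = fields
    if not country.isdigit():
        raise ValueError("country must be a numeric country code")
    if codepage and not codepage.isdigit():
        raise ValueError("codepage must be numeric")
    return {
        "country": int(country),
        "codepage": int(codepage) if codepage else None,
        "country_sys": country_sys or None,
    }


# 41 is the telephone prefix of Switzerland; the path is hypothetical.
print(parse_country("41"))
print(parse_country("41,850,/dosc/country.sys"))
print(parse_country("41,,/dosc/country.sys"))
```
</PRE>
<!-- END CODE //-->
<P>The third call shows the case in which the code page is left empty so that the per-country default applies, while a country.sys file is still named.
</P>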
<P><B>
CONFIGURATION USING EXPLICIT TRANSLATION TABLES
</B></P>
<P>Translation tables may be described in lines in the configuration file. Two tables are needed: first the DOS-to-UNIX
table and then the lowercase-to-uppercase table. A DOS-to-UNIX table starts with the
tounix keyword, followed by a colon and 128 hexadecimal numbers. A lower-to-upper table starts with the
fucase keyword, followed by a colon and 128
hexadecimal numbers.
</P>
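<P>The table format described above (a keyword, a colon, and 128 hexadecimal numbers) can be read with a few lines of Python. This is only a sketch under the assumption that entries are separated by whitespace; the function name parse_tables is invented here, and the real mtools parser may accept or reject input differently.
</P>
<!-- CODE //-->
<PRE>
```python
def parse_tables(text):
    """Collect the hexadecimal entries that follow each table keyword."""
    tables = {}
    current = None
    for token in text.replace(":", ": ").split():
        if token.endswith(":"):
            current = token[:-1]          # e.g. "tounix" or "fucase"
            tables[current] = []
        elif current is None:
            raise ValueError("number found before any table keyword")
        else:
            tables[current].append(int(token, 16))
    for name, entries in tables.items():
        if len(entries) != 128:
            raise ValueError(name + " needs exactly 128 entries")
    return tables


# Synthetic identity tables, used only to exercise the parser.
identity = " ".join(hex(code) for code in range(128, 256))
sample = "tounix:\n" + identity + "\nfucase:\n" + identity
tables = parse_tables(sample)
print(sorted(tables), len(tables["tounix"]))
```
</PRE>
<!-- END CODE //-->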
<P>The tables only show the translations for characters whose codes are 128 or greater because translation for lower codes
is trivial. Example:
</P>
<A NAME="PAGENUM-1157"><P>Page 1157</P></A>
<!-- CODE //-->
<PRE>
tounix:
0xc7 0xfc 0xe9 0xe2 0xe4 0xe0 0xe5 0xe7
0xea 0xeb 0xe8 0xef 0xee 0xec 0xc4 0xc5
0xc9 0xe6 0xc6 0xf4 0xf6 0xf2 0xfb 0xf9
0xff 0xd6 0xdc 0xf8 0xa3 0xd8 0xd7 0x5f
0xe1 0xed 0xf3 0xfa 0xf1 0xd1 0xaa 0xba
0xbf 0xae 0xac 0xbd 0xbc 0xa1 0xab 0xbb
0x5f 0x5f 0x5f 0x5f 0x5f 0xc1 0xc2 0xc0
0xa9 0x5f 0x5f 0x5f 0x5f 0xa2 0xa5 0xac
0x5f 0x5f 0x5f 0x5f 0x5f 0x5f 0xe3 0xc3
0x5f 0x5f 0x5f 0x5f 0x5f 0x5f 0x5f 0xa4
0xf0 0xd0 0xc9 0xcb 0xc8 0x69 0xcd 0xce
0xcf 0x5f 0x5f 0x5f 0x5f 0x7c 0x49 0x5f
0xd3 0xdf 0xd4 0xd2 0xf5 0xd5 0xb5 0xfe
0xde 0xda 0xd9 0xfd 0xdd 0xde 0xaf 0xb4
0xad 0xb1 0x5f 0xbe 0xb6 0xa7 0xf7 0xb8
0xb0 0xa8 0xb7 0xb9 0xb3 0xb2 0x5f 0x5f
fucase:
0x80 0x9a 0x90 0xb6 0x8e 0xb7 0x8f 0x80
0xd2 0xd3 0xd4 0xd8 0xd7 0xde 0x8e 0x8f
0x90 0x92 0x92 0xe2 0x99 0xe3 0xea 0xeb
0x59 0x99 0x9a 0x9d 0x9c 0x9d 0x9e 0x9f
0xb5 0xd6 0xe0 0xe9 0xa5 0xa5 0xa6 0xa7
0xa8 0xa9 0xaa 0xab 0xac 0xad 0xae 0xaf
0xb0 0xb1 0xb2 0xb3 0xb4 0xb5 0xb6 0xb7
0xb8 0xb9 0xba 0xbb 0xbc 0xbd 0xbe 0xbf
0xc0 0xc1 0xc2 0xc3 0xc4 0xc5 0xc7 0xc7
0xc8 0xc9 0xca 0xcb 0xcc 0xcd 0xce 0xcf
0xd1 0xd1 0xd2 0xd3 0xd4 0x49 0xd6 0xd7
0xd8 0xd9 0xda 0xdb 0xdc 0xdd 0xde 0xdf
0xe0 0xe1 0xe2 0xe3 0xe5 0xe5 0xe6 0xe8
0xe8 0xe9 0xea 0xeb 0xed 0xed 0xee 0xef
0xf0 0xf1 0xf2 0xf3 0xf4 0xf5 0xf6 0xf7
0xf8 0xf9 0xfa 0xfb 0xfc 0xfd 0xfe 0xff
</PRE>
<!-- END CODE //-->
<P>The first table maps DOS character codes to UNIX character codes. For example, the DOS character number 129 is a u
with two dots on top of it. To translate it into UNIX, we look at the character number 1 in the first table (1 = 129 - 128). This
is 0xfc. (Beware; numbering starts at 0.) The second table maps lowercase DOS characters to uppercase DOS characters.
The same lowercase u with dots maps to character
0x9a, which is an uppercase U with dots in DOS.
</P>
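<P>The lookup just described can be written out as a small Python sketch. To keep it short, it copies only the first eight entries of each example table, so it covers DOS codes 128 through 135 only; the names TOUNIX_HEAD, FUCASE_HEAD, dos_to_unix, and dos_upper are invented for this illustration.
</P>
<!-- CODE //-->
<PRE>
```python
# First 8 entries of each table in the example (DOS codes 128..135).
TOUNIX_HEAD = [0xC7, 0xFC, 0xE9, 0xE2, 0xE4, 0xE0, 0xE5, 0xE7]
FUCASE_HEAD = [0x80, 0x9A, 0x90, 0xB6, 0x8E, 0xB7, 0x8F, 0x80]


def dos_to_unix(dos_code):
    """Translate a DOS character code to its UNIX code."""
    if dos_code in range(128):
        return dos_code                      # seven-bit codes are unchanged
    return TOUNIX_HEAD[dos_code - 128]       # entry 0 is DOS code 128


def dos_upper(dos_code):
    """Translate a DOS character code to its uppercase DOS code."""
    if dos_code in range(128):
        return ord(chr(dos_code).upper())    # plain ASCII uppercasing
    return FUCASE_HEAD[dos_code - 128]


unix_code = dos_to_unix(129)     # lowercase u with dots in DOS
upper_code = dos_upper(129)
print(hex(unix_code), hex(upper_code))   # prints: 0xfc 0x9a
```
</PRE>
<!-- END CODE //-->
<P>With the complete 128-entry tables in place of the shortened lists, the same index arithmetic applies to every code from 128 to 255.
</P>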
<P><B>
UNICODE CHARACTERS GREATER THAN 256
</B></P>
<P>If an existing MS-DOS name contains Unicode characters greater than 256, these are translated to underscores or
to characters that are close in visual appearance. For example, accented consonants are translated into their
unaccented counterparts. This translation is used for
mdir and for the UNIX filenames generated by mcopy. Linux does support
Unicode too, but unfortunately, too few applications support it yet to bother with it in
mtools. Most importantly, xterm can't display Unicode yet. If there is sufficient demand, I might include support for Unicode in the UNIX filenames as well.
</P>
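<P>One way to picture this kind of translation is the Python sketch below. It is not the table that mtools uses: it approximates "close in visual appearance" with Unicode canonical decomposition, which is an assumption of this example, and the function names are invented here.
</P>
<!-- CODE //-->
<PRE>
```python
import unicodedata


def to_unix_char(ch):
    """Map one character of a long name to an 8-bit substitute."""
    if ord(ch) in range(256):
        return ch
    # Canonical decomposition puts the base letter first, accents after it.
    base = unicodedata.normalize("NFD", ch)[0]
    if ord(base) in range(256):
        return base
    return "_"


def to_unix_name(name):
    """Translate every character of a long name."""
    return "".join(to_unix_char(ch) for ch in name)


print(to_unix_name("\u015bwiat.txt"))   # name starting with s plus acute accent
print(to_unix_name("\u4e2d.txt"))       # name starting with a CJK character
```
</PRE>
<!-- END CODE //-->
<P>The accented consonant loses its accent, and the character that has no similar-looking 8-bit counterpart becomes an underscore.
</P>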
<P>Caution: When deleting files with mtools, the underscore matches all characters that can't be represented in UNIX.
Be careful before you run mdel!
</P>
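<P>The risk can be made concrete with a small Python sketch. It models the rule stated in the caution under the assumption that an underscore matches either a literal underscore or any character above 255; the function could_match and the sample names are hypothetical and do not reproduce the matching code in mtools.
</P>
<!-- CODE //-->
<PRE>
```python
def could_match(displayed, dos_name):
    """Return True if a displayed UNIX name may refer to the DOS long name.

    Assumption: an underscore matches itself or any character above 255.
    """
    if len(displayed) != len(dos_name):
        return False
    for shown, real in zip(displayed, dos_name):
        if shown == "_" and ord(real) >= 256:
            continue
        if shown != real:
            return False
    return True


# Hypothetical DOS long names; the first two differ only in one character
# that cannot be represented in UNIX.
names = ["\u4e2d.txt", "\u6587.txt", "_.txt", "a.txt"]
hits = [name for name in names if could_match("_.txt", name)]
print(len(hits))
```
</PRE>
<!-- END CODE //-->
<P>Under this model, a single displayed name stands for several distinct files, so deleting by that name could remove more than intended.
</P>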
<P><B>
LOCATION OF CONFIGURATION FILES AND PARSING ORDER
</B></P>
<P>The configuration files are parsed in the following order:
</P>
<P>Compiled-in defaults
</P>
</td>
</tr>
</table>
<!-- begin footer information -->
</body></html>