<HTML>

<HEAD>

<TITLE>Linux Complete Command Reference:File Formats:EarthWeb Inc.-</TITLE>


<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<SCRIPT>
<!--
function displayWindow(url, width, height) {
        var Win = window.open(url,"displayWindow",'width=' + width +
',height=' + height + ',resizable=1,scrollbars=yes');
}
//-->
</SCRIPT>
</HEAD>





<!-- ISBN=0672311046 //-->

<!-- TITLE=Linux Complete Command Reference//-->

<!-- AUTHOR=Red Hat//-->

<!-- PUBLISHER=Macmillan Computer Publishing//-->

<!-- IMPRINT=Sams//-->

<!-- CHAPTER=05 //-->

<!-- PAGES=1103-1208 //-->

<!-- UNASSIGNED1 //-->

<!-- UNASSIGNED2 //-->



<P><CENTER>

<a href="1155-1155.html">Previous</A> | <a href="../ewtoc.html">Table of Contents</A> | <a href="1158-1159.html">Next</A></CENTER></P>







<A NAME="PAGENUM-1156"><P>Page 1156</P></A>





<P><B>

INTRODUCTION

</B></P>



<P>DOS uses a different character code mapping from UNIX. Seven-bit characters still have the same meaning; only
characters with the eighth bit set are affected. To make matters worse, several translation tables are available, depending on
the country where you are. The appearance of the characters is defined using code pages, and these code pages aren't the same for
all countries. For instance, some code pages don't contain uppercase accented characters. On the other hand, some code
pages contain characters that don't exist in UNIX, such as certain line-drawing characters or accented consonants used by
some Eastern European countries. This affects two things relating to filenames:

</P>



<TABLE>



<TR><TD>

Uppercase characters

</TD><TD>

In short names, only uppercase characters are allowed. This also holds for

accented characters. For instance, in a code page that doesn't contain accented uppercase

characters, the accented lowercase characters get transformed into their unaccented counterparts.

</TD></TR><TR><TD>

Long filenames

</TD><TD>

Microsoft has finally come to their senses and uses a more standard mapping for the
long filenames. They use Unicode, stored in long names as 16-bit characters. Its first
256 characters are identical to ISO 8859-1 (Latin-1), the eight-bit character set commonly used on UNIX. Thus, the code page also affects the
correspondence between the codes used in long names and those used in short names.

</TD></TR></TABLE>





<P>mtools considers the filenames entered on the command line as having the UNIX mapping and translates the characters

to get short names. By default, code page 850 is used with the Swiss uppercase/lowercase mapping. I chose this code

page because its set of existing characters most closely matches UNIX's. Moreover, this code page covers most characters in use

in the USA, Australia, and Western Europe. However, it is still possible to choose a different mapping. There are two

methods: the country variable and explicit tables.

</P>
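<P>As a small illustration of the difference between the two mappings, the following sketch compares the codes of a character under code page 850 and under ISO 8859-1. It relies on the cp850 and latin-1 codecs that ship with Python, not on mtools itself, and the function name encodings is made up for this example.
</P>

```python
# Compare how one character is encoded under DOS code page 850 and
# under ISO 8859-1 (Latin-1), using Python's built-in codecs.
# This is an illustration only; mtools uses its own compiled-in tables.

def encodings(ch):
    """Return (cp850 code, latin-1 code) for a single character."""
    return ch.encode("cp850")[0], ch.encode("latin-1")[0]

# Seven-bit characters have the same code in both mappings.
print([hex(b) for b in encodings("A")])        # ['0x41', '0x41']

# Characters with the eighth bit set differ.
print([hex(b) for b in encodings("\u00fc")])   # ['0x81', '0xfc'] (u with two dots)
```

<P>The second result is the same pair of codes (DOS 129, UNIX 0xfc) that the explicit translation tables later in this section use as their example.
</P>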



<P><B>

CONFIGURATION USING COUNTRY

</B></P>



<P>The COUNTRY variable is recommended for people who also have access to MS-DOS system files and documentation. If

you don't have access to these, I'd suggest you use explicit tables instead.

</P>



<P>

Syntax: COUNTRY=&quot;country[,[codepage],country.sys]&quot;

</P>



<P>This tells mtools to use a UNIX-to-DOS translation table that matches

codepage and a lowercase-to-uppercase table for

country and to use the country.sys file to get the lowercase-to-uppercase table. The country code is most often the

telephone prefix of the country. Refer to the DOS help page on

country for more details. The codepage and the

country.sys parameters are optional. Don't type in the square brackets; they are only there to indicate which parameters are optional.

The country.sys file is supplied with MS-DOS. In most cases, you don't need it because the most common translation tables

are compiled into mtools. Don't worry if you run a UNIX-only box that lacks this file.

</P>



<P>If codepage is not given, a per-country default code page is used. If the

country.sys parameter isn't given, compiled-in defaults are used for the lowercase-to-uppercase table. This is useful for Unices other than Linux, which may have

no country.sys file available online.

</P>
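<P>To make the optional fields concrete, here is a rough sketch of how a COUNTRY value could be split into its three parts. The helper parse_country is hypothetical and is not how mtools itself parses the variable; it assumes that the country code and code page are plain decimal numbers, and the country.sys path shown is made up.
</P>

```python
def parse_country(value):
    """Split a COUNTRY value into (country, codepage, country_file).

    Hypothetical helper for illustration; not mtools's own parser.
    Optional fields that were left out come back as None.
    """
    fields = [f.strip() for f in value.split(",", 2)]
    fields += [""] * (3 - len(fields))      # pad the missing optional fields
    country, codepage, country_file = fields
    if not country.isdigit():
        raise ValueError("country must be a numeric country code")
    if codepage and not codepage.isdigit():
        raise ValueError("codepage must be numeric")
    return (int(country),
            int(codepage) if codepage else None,
            country_file or None)

# 41 is the telephone prefix of Switzerland.
print(parse_country("41"))                    # (41, None, None)
print(parse_country("41,850"))                # (41, 850, None)
print(parse_country("41,,/dos/country.sys"))  # (41, None, '/dos/country.sys')
```

<P>In the first call both optional parts are absent, so the per-country default code page and the compiled-in lowercase-to-uppercase table would apply.
</P>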



<P>The UNIX-to-DOS tables are not contained in the
country.sys file, and thus mtools always uses compiled-in defaults for
those. Thus, only a limited number of code pages are supported. If your preferred code page is missing, or if you know the name

of the Windows 95 file that contains this mapping, drop me a line at

Alain.Knaff@inrialpes.fr.

</P>



<P>The COUNTRY variable can also be set using the environment.

</P>



<P><B>

CONFIGURATION USING EXPLICIT TRANSLATION TABLES

</B></P>



<P>Translation tables may be described in lines in the configuration file. Two tables are needed: first the DOS-to-UNIX

table and then the lowercase-to-uppercase table. A DOS-to-UNIX table starts with the

tounix keyword, followed by a colon and 128 hexadecimal numbers. A lower-to-upper table starts with the

fucase keyword, followed by a colon and 128

hexadecimal numbers.

</P>
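<P>The table format described here (a keyword, a colon, then 128 hexadecimal numbers) can be checked with a few lines of code. The following is only a sketch under that description; parse_table is a made-up name, and the real mtools configuration parser may accept or reject input differently.
</P>

```python
def parse_table(text, keyword):
    """Parse 'keyword: 0x.. 0x.. ...' into a list of 128 integers.

    Illustrative only; this is not the parser used by mtools.
    """
    head, sep, body = text.partition(":")
    if not sep or head.strip() != keyword:
        raise ValueError("expected '%s:' at the start" % keyword)
    values = [int(tok, 16) for tok in body.split()]
    if len(values) != 128:
        raise ValueError("expected 128 entries, got %d" % len(values))
    if any(v not in range(0x100) for v in values):
        raise ValueError("every entry must fit in one byte")
    return values

# An identity table: every code from 0x80 to 0xff maps to itself.
identity = "fucase: " + " ".join(hex(c) for c in range(0x80, 0x100))
table = parse_table(identity, "fucase")
print(len(table), hex(table[0]), hex(table[-1]))   # 128 0x80 0xff
```

<P>Whitespace and line breaks between the numbers are ignored by this sketch, so a table laid out in rows of eight parses the same way as one written on a single line.
</P>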



<P>The tables only show the translations for characters whose codes are 128 or greater because translation for lower codes

is trivial. Example:

</P>



<A NAME="PAGENUM-1157"><P>Page 1157</P></A>





<!-- CODE //-->

<PRE>
tounix:

0xc7 0xfc 0xe9 0xe2 0xe4 0xe0 0xe5 0xe7
0xea 0xeb 0xe8 0xef 0xee 0xec 0xc4 0xc5
0xc9 0xe6 0xc6 0xf4 0xf6 0xf2 0xfb 0xf9
0xff 0xd6 0xdc 0xf8 0xa3 0xd8 0xd7 0x5f
0xe1 0xed 0xf3 0xfa 0xf1 0xd1 0xaa 0xba
0xbf 0xae 0xac 0xbd 0xbc 0xa1 0xab 0xbb
0x5f 0x5f 0x5f 0x5f 0x5f 0xc1 0xc2 0xc0
0xa9 0x5f 0x5f 0x5f 0x5f 0xa2 0xa5 0xac
0x5f 0x5f 0x5f 0x5f 0x5f 0x5f 0xe3 0xc3
0x5f 0x5f 0x5f 0x5f 0x5f 0x5f 0x5f 0xa4
0xf0 0xd0 0xc9 0xcb 0xc8 0x69 0xcd 0xce
0xcf 0x5f 0x5f 0x5f 0x5f 0x7c 0x49 0x5f
0xd3 0xdf 0xd4 0xd2 0xf5 0xd5 0xb5 0xfe
0xde 0xda 0xd9 0xfd 0xdd 0xde 0xaf 0xb4
0xad 0xb1 0x5f 0xbe 0xb6 0xa7 0xf7 0xb8
0xb0 0xa8 0xb7 0xb9 0xb3 0xb2 0x5f 0x5f

fucase:

0x80 0x9a 0x90 0xb6 0x8e 0xb7 0x8f 0x80
0xd2 0xd3 0xd4 0xd8 0xd7 0xde 0x8e 0x8f
0x90 0x92 0x92 0xe2 0x99 0xe3 0xea 0xeb
0x59 0x99 0x9a 0x9d 0x9c 0x9d 0x9e 0x9f
0xb5 0xd6 0xe0 0xe9 0xa5 0xa5 0xa6 0xa7
0xa8 0xa9 0xaa 0xab 0xac 0xad 0xae 0xaf
0xb0 0xb1 0xb2 0xb3 0xb4 0xb5 0xb6 0xb7
0xb8 0xb9 0xba 0xbb 0xbc 0xbd 0xbe 0xbf
0xc0 0xc1 0xc2 0xc3 0xc4 0xc5 0xc7 0xc7
0xc8 0xc9 0xca 0xcb 0xcc 0xcd 0xce 0xcf
0xd1 0xd1 0xd2 0xd3 0xd4 0x49 0xd6 0xd7
0xd8 0xd9 0xda 0xdb 0xdc 0xdd 0xde 0xdf
0xe0 0xe1 0xe2 0xe3 0xe5 0xe5 0xe6 0xe8
0xe8 0xe9 0xea 0xeb 0xed 0xed 0xee 0xef
0xf0 0xf1 0xf2 0xf3 0xf4 0xf5 0xf6 0xf7
0xf8 0xf9 0xfa 0xfb 0xfc 0xfd 0xfe 0xff
</PRE>

<!-- END CODE //-->



<P>The first table maps DOS character codes to UNIX character codes. For example, the DOS character number 129 is a u

with two dots on top of it. To translate it into UNIX, we look at the character number 1 in the first table (1 = 129 - 128). This

is 0xfc. (Beware; numbering starts at 0.) The second table maps lowercase DOS characters to uppercase DOS characters.

The same lowercase u with dots maps to character

0x9a, which is an uppercase U with dots in DOS.

</P>
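<P>The lookup just described can be traced in code. This sketch copies only the first 32 entries of each example table (DOS codes 0x80 through 0x9f), which is enough for the u-with-dots example; the names TOUNIX, FUCASE, and lookup are chosen for this illustration and do not come from mtools.
</P>

```python
# First 32 entries of the example tables, covering DOS codes 0x80-0x9f.
TOUNIX = [
    0xc7, 0xfc, 0xe9, 0xe2, 0xe4, 0xe0, 0xe5, 0xe7,
    0xea, 0xeb, 0xe8, 0xef, 0xee, 0xec, 0xc4, 0xc5,
    0xc9, 0xe6, 0xc6, 0xf4, 0xf6, 0xf2, 0xfb, 0xf9,
    0xff, 0xd6, 0xdc, 0xf8, 0xa3, 0xd8, 0xd7, 0x5f,
]
FUCASE = [
    0x80, 0x9a, 0x90, 0xb6, 0x8e, 0xb7, 0x8f, 0x80,
    0xd2, 0xd3, 0xd4, 0xd8, 0xd7, 0xde, 0x8e, 0x8f,
    0x90, 0x92, 0x92, 0xe2, 0x99, 0xe3, 0xea, 0xeb,
    0x59, 0x99, 0x9a, 0x9d, 0x9c, 0x9d, 0x9e, 0x9f,
]

def lookup(table, dos_code):
    """Return the table entry for a DOS code; the index is dos_code - 128."""
    index = dos_code - 128
    if index not in range(len(table)):
        raise ValueError("code is outside the range covered by this sketch")
    return table[index]

unix_code = lookup(TOUNIX, 129)          # lowercase u with dots, DOS side
upper_dos = lookup(FUCASE, 129)          # its uppercase form, still a DOS code
upper_unix = lookup(TOUNIX, upper_dos)   # the uppercase form on the UNIX side

print(hex(unix_code), hex(upper_dos), hex(upper_unix))   # 0xfc 0x9a 0xdc
```

<P>Note that the result of the fucase table is still a DOS code, so it has to go through the tounix table again before it can be compared with a UNIX character.
</P>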





<P><B>

UNICODE CHARACTERS GREATER THAN 255

</B></P>



<P>If an existing MS-DOS name contains Unicode characters greater than 255, these are translated to underscores or

to characters that are close in visual appearance. For example, accented consonants are translated into their

unaccented counterparts. This translation is used for

mdir and for the UNIX filenames generated by mcopy. Linux does support

Unicode too, but unfortunately, too few applications support it yet to bother with it in

mtools. Most importantly, xterm can't display Unicode yet. If there is sufficient demand, I might include support for Unicode in the UNIX filenames as well.

</P>



<P>Caution: When deleting files with mtools, the underscore matches all characters that can't be represented in UNIX.

Be careful before using mdel!

</P>
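<P>The reason for this caution can be seen in a simplified model of the translation. The sketch below always substitutes an underscore, whereas mtools may pick a visually similar character for some codes; the function unix_name and the two filenames are invented for the example.
</P>

```python
def unix_name(dos_long_name):
    """Replace every character above U+00FF with an underscore.

    Simplified model: mtools may instead choose a visually similar
    character for some codes; this sketch always uses '_'.
    """
    return "".join(c if ord(c) in range(0x100) else "_"
                   for c in dos_long_name)

# Two distinct (made-up) long names collapse to the same UNIX name.
a = unix_name("\u0159eka.txt")   # starts with r with caron
b = unix_name("\u0142eka.txt")   # starts with l with stroke
print(a, b, a == b)              # _eka.txt _eka.txt True
```

<P>Because both names look identical on the UNIX side, a name containing an underscore may refer to more files than expected.
</P>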



<P><B>

LOCATION OF CONFIGURATION FILES AND PARSING ORDER

</B></P>



<P>The configuration files are parsed in the following order:

</P>



<P>Compiled-in defaults

</P>















<!-- begin footer information -->







</body></html>