produces the following output:<P><div class="verbatim"><pre>
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to</pre></div><P>The interpreter prints the result of string operations in the same way
as they are typed for input: inside quotes, and with quotes and other
special characters escaped by backslashes, to show the precise
value.  The string is enclosed in double quotes if it contains
a single quote and no double quotes; otherwise it is enclosed in single
quotes.  (The <tt class="keyword">print</tt> statement, described later, can be used
to write strings without quotes or escapes.)<P>Strings can be concatenated (glued together) with the
<code>+</code> operator, and repeated with <code>*</code>:<P><div class="verbatim"><pre>
&gt;&gt;&gt; word = 'Help' + 'A'
&gt;&gt;&gt; word
'HelpA'
&gt;&gt;&gt; '&lt;' + word*5 + '&gt;'
'&lt;HelpAHelpAHelpAHelpAHelpA&gt;'</pre></div><P>Two string literals next to each other are automatically concatenated;
the first line above could also have been written "<tt class="samp">word = 'Help'
'A'</tt>"; this only works with two literals, not with arbitrary string
expressions:<P><div class="verbatim"><pre>
&gt;&gt;&gt; 'str' 'ing'                   #  &lt;-  This is ok
'string'
&gt;&gt;&gt; 'str'.strip() + 'ing'   #  &lt;-  This is ok
'string'
&gt;&gt;&gt; 'str'.strip() 'ing'     #  &lt;-  This is invalid
  File "&lt;stdin&gt;", line 1, in ?
    'str'.strip() 'ing'
                      ^
SyntaxError: invalid syntax</pre></div><P>Strings can be subscripted (indexed); like in C, the first character
of a string has subscript (index) 0.  There is no separate character
type; a character is simply a string of size one.  Like in Icon,
substrings can be specified with the <em>slice notation</em>: two indices
separated by a colon.<P><div class="verbatim"><pre>
&gt;&gt;&gt; word[4]
'A'
&gt;&gt;&gt; word[0:2]
'He'
&gt;&gt;&gt; word[2:4]
'lp'</pre></div><P>Slice indices have useful defaults; an omitted first index defaults to
zero, an omitted second index defaults to the size of the string being
sliced.<P><div class="verbatim"><pre>
&gt;&gt;&gt; word[:2]    # The first two characters
'He'
&gt;&gt;&gt; word[2:]    # Everything except the first two characters
'lpA'</pre></div><P>Unlike a C string, Python strings cannot be changed.  Assigning to an
indexed position in the string results in an error:<P><div class="verbatim"><pre>
&gt;&gt;&gt; word[0] = 'x'
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in ?
TypeError: object doesn't support item assignment
&gt;&gt;&gt; word[:1] = 'Splat'
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in ?
TypeError: object doesn't support slice assignment</pre></div><P>However, creating a new string with the combined content is easy and
efficient:<P><div class="verbatim"><pre>
&gt;&gt;&gt; 'x' + word[1:]
'xelpA'
&gt;&gt;&gt; 'Splat' + word[4]
'SplatA'</pre></div><P>Here's a useful invariant of slice operations:
<code>s[:i] + s[i:]</code> equals <code>s</code>.<P><div class="verbatim"><pre>
&gt;&gt;&gt; word[:2] + word[2:]
'HelpA'
&gt;&gt;&gt; word[:3] + word[3:]
'HelpA'</pre></div><P>Degenerate slice indices are handled gracefully: an index that is too
large is replaced by the string size, an upper bound smaller than the
lower bound returns an empty string.<P><div class="verbatim"><pre>
&gt;&gt;&gt; word[1:100]
'elpA'
&gt;&gt;&gt; word[10:]
''
&gt;&gt;&gt; word[2:1]
''</pre></div><P>Indices may be negative numbers, to start counting from the right.
For example:<P><div class="verbatim"><pre>
&gt;&gt;&gt; word[-1]     # The last character
'A'
&gt;&gt;&gt; word[-2]     # The last-but-one character
'p'
&gt;&gt;&gt; word[-2:]    # The last two characters
'pA'
&gt;&gt;&gt; word[:-2]    # Everything except the last two characters
'Hel'</pre></div><P>But note that -0 is really the same as 0, so it does not count from
the right!<P><div class="verbatim"><pre>
&gt;&gt;&gt; word[-0]     # (since -0 equals 0)
'H'</pre></div><P>Out-of-range negative slice indices are truncated, but don't try this
for single-element (non-slice) indices:<P><div class="verbatim"><pre>
&gt;&gt;&gt; word[-100:]
'HelpA'
&gt;&gt;&gt; word[-10]    # error
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in ?
IndexError: string index out of range</pre></div><P>The best way to remember how slices work is to think of the indices as
pointing <em>between</em> characters, with the left edge of the first
character numbered 0.  Then the right edge of the last character of a
string of <var>n</var> characters has index <var>n</var>, for example:<P><div class="verbatim"><pre>
 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1</pre></div><P>The first row of numbers gives the position of the indices 0...5 in
the string; the second row gives the corresponding negative indices.
The slice from <var>i</var> to <var>j</var> consists of all characters between
the edges labeled <var>i</var> and <var>j</var>, respectively.<P>For non-negative indices, the length of a slice is the difference of
the indices, if both are within bounds.  For example, the length of
<code>word[1:3]</code> is 2.<P>The built-in function <tt class="function">len()</tt> returns the length of a string:<P><div class="verbatim"><pre>
&gt;&gt;&gt; s = 'supercalifragilisticexpialidocious'
&gt;&gt;&gt; len(s)
34</pre></div><P><div class="seealso">  <p class="heading">See Also:</p>
  <dl compact="compact" class="seetitle">    <dt><em class="citetitle"><a href="../lib/typesseq.html"        >Sequence Types</a></em></dt>    <dd>Strings, and the Unicode strings described in the next
            section, are examples of <em>sequence types</em>, and
            support the common operations supported by such types.</dd>  </dl>
  <dl compact="compact" class="seetitle">    <dt><em class="citetitle"><a href="../lib/string-methods.html"        >String Methods</a></em></dt>    <dd>Both strings and Unicode strings support a large number of
            methods for basic transformations and searching.</dd>  </dl>
  <dl compact="compact" class="seetitle">    <dt><em class="citetitle"><a href="../lib/typesseq-strings.html"        >String Formatting Operations</a></em></dt>    <dd>The formatting operations invoked when strings and Unicode
            strings are the left operand of the <code>%</code> operator are
            described in more detail here.</dd>  </dl>
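<P>As a quick recap of the slicing rules in this section, here is a minimal sketch that is not part of the original tutorial. It assumes Python 3 (for the <code>print()</code> function), and <code>slice_length</code> is a hypothetical helper introduced only for illustration.

```python
# Recap sketch of the slicing rules; `word` matches the tutorial's example.
word = 'Help' + 'A'

def slice_length(s, i, j):
    """Length of s[i:j]; hypothetical helper, not part of any library."""
    return len(s[i:j])

# For in-bounds, non-negative indices the length is the difference j - i.
in_bounds = slice_length(word, 1, 3)       # word[1:3] == 'el'

# An out-of-range upper bound is replaced by the string size.
clamped = word[1:100]                      # 'elpA'

# s[:i] + s[i:] rebuilds s for every split point, even out-of-range ones.
invariant_holds = all(word[:i] + word[i:] == word for i in range(-10, 11))

# Strings are immutable, so "changing" one means building a new string.
replaced = 'x' + word[1:]                  # 'xelpA'

print(in_bounds, clamped, invariant_holds, replaced)
```

<P>Under those assumptions the script prints <code>2 elpA True xelpA</code>.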
</div><P><H2><A NAME="SECTION005130000000000000000"></A><A NAME="unicodeStrings"></A><BR>3.1.3 Unicode Strings </H2><P>Starting with Python 2.0, a new data type for storing text data is
available to the programmer: the Unicode object. It can be used to
store and manipulate Unicode data (see <a class="url" href="http://www.unicode.org/">http://www.unicode.org/</a>)
and integrates well with the existing string objects, providing
auto-conversions where necessary.<P>Unicode has the advantage of providing one ordinal for every character
in every script used in modern and ancient texts. Previously, there
were only 256 possible ordinals for script characters and texts were
typically bound to a code page which mapped the ordinals to script
characters. This led to a great deal of confusion, especially with respect
to internationalization (usually written as "<tt class="samp">i18n</tt>" --
"<tt class="character">i</tt>" + 18 characters + "<tt class="character">n</tt>") of software.  Unicode
solves these problems by defining one code page for all scripts.<P>Creating Unicode strings in Python is just as simple as creating
normal strings:<P><div class="verbatim"><pre>
&gt;&gt;&gt; u'Hello World !'
u'Hello World !'</pre></div><P>The small "<tt class="character">u</tt>" in front of the quote indicates that a
Unicode string is to be created. If you want to include
special characters in the string, you can do so by using the Python
<em>Unicode-Escape</em> encoding. The following example shows how:<P><div class="verbatim"><pre>
&gt;&gt;&gt; u'Hello\u0020World !'
u'Hello World !'</pre></div><P>The escape sequence <code>&#92;u0020</code> inserts the Unicode
character with the ordinal value 0x0020 (the space character) at the
given position.<P>Other characters are interpreted by using their respective ordinal
values directly as Unicode ordinals.  If you have literal strings
in the standard Latin-1 encoding that is used in many Western countries,
you will find it convenient that the lower 256 characters
of Unicode are the same as the 256 characters of Latin-1.<P>For experts, there is also a raw mode just like the one for normal
strings. You have to prefix the opening quote with 'ur' to have
Python use the <em>Raw-Unicode-Escape</em> encoding. It will only apply
the above <code>&#92;uXXXX</code> conversion if there is an odd number of
backslashes in front of the small 'u'.<P><div class="verbatim"><pre>
&gt;&gt;&gt; ur'Hello\u0020World !'
u'Hello World !'
&gt;&gt;&gt; ur'Hello\\u0020World !'
u'Hello\\\\u0020World !'</pre></div><P>The raw mode is most useful when you have to enter lots of
backslashes, as can be necessary in regular expressions.<P>Apart from these standard encodings, Python provides a whole set of
other ways of creating Unicode strings on the basis of a known
encoding.<P>The built-in function <tt class="function">unicode()</tt><a id='l2h-4' xml:id='l2h-4'></a> provides
access to all registered Unicode codecs (COders and DECoders). Some of
the more well known encodings which these codecs can convert are
<em>Latin-1</em>, <em>ASCII</em>, <em>UTF-8</em>, and <em>UTF-16</em>.
The latter two are variable-length encodings that store each Unicode
character in one or more bytes. The default encoding is
normally set to ASCII, which passes through characters in the range
0 to 127 and rejects any other characters with an error.
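<P>To make the codec behaviour concrete, here is a minimal sketch that is not part of the original tutorial. It assumes Python 3, where <code>str.encode()</code> stands in for the Python 2 <code>unicode()</code> and <code>str()</code> conversions described here. The sample text and variable names are illustrative only.

```python
# Sketch in Python 3 syntax (an assumption: the tutorial text targets
# Python 2, where u'...' literals and unicode() play this role).
text = 'caf\u00e9'   # four characters; the last one has ordinal 0xE9

# Latin-1 stores every ordinal 0..255 in exactly one byte, and the byte
# value equals the Unicode ordinal of the character.
latin1_bytes = text.encode('latin-1')
same_ordinal = latin1_bytes[-1] == ord(text[-1])

# UTF-8 is variable-length: ASCII stays one byte, 0xE9 takes two.
utf8_bytes = text.encode('utf-8')

# UTF-16 (little-endian, no byte-order mark) uses two bytes per character here.
utf16_bytes = text.encode('utf-16-le')

# ASCII only covers ordinals 0..127, so encoding 0xE9 raises an error.
try:
    text.encode('ascii')
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

print(len(latin1_bytes), len(utf8_bytes), len(utf16_bytes))
print(same_ordinal, type(ascii_error).__name__)
```

<P>Under those assumptions the byte counts are 4, 5 and 8, and the ASCII attempt fails with <code>UnicodeEncodeError</code>, which is the rejection described above.<P>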
When a Unicode string is printed, written to a file, or converted
with <tt class="function">str()</tt>, conversion takes place using this default encoding.<P><div class="verbatim"><pre>
&gt;&gt;&gt; u"abc"
u'abc'
&gt;&gt;&gt; str(u"abc")
'abc'
&gt;&gt;&gt; u"äöü"