and <I><tt cLass="monofont">decode</tT></i>
arguments define the encoding and decoding functions applied to data passing through the <tt CLASs="monofont">read()</tt> and <tT CLAss="monofont">write()</tt> methods, respectively; that is, data returned by <TT CLass="monofont">read()</tT> is encoded according to <I><TT class="monofont">encode</tt></i>
and data given to <tt class="monofont">write()</tt> is decoded according to <i><tt ClaSs="monofont">decode</tt></I>
. <i><tt cLass="monofont">reader</TT></I>
and <I><tt clASS="monofont">writer</Tt></i>
are the <tt CLASs="monofont">StreamReader</tt> and <tT CLAss="monofont">StreamWriter</tt> classes used to read and write the actual contents of the data stream. A <tt class="monofont">StreamRecoder</tt> object provides the combined interface of <tt class="monofont">StreamReader</Tt> and <tT claSs="monofont">StreamWriter</tt>.</p>
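<p>As a rough illustration of the recoding described above, the following sketch builds a <tt class="monofont">StreamRecoder</tt> by hand. It assumes Python 3, where data are byte strings and <tt class="monofont">codecs.lookup()</tt> returns an object with <tt class="monofont">encode</tt>, <tt class="monofont">decode</tt>, <tt class="monofont">streamreader</tt>, and <tt class="monofont">streamwriter</tt> attributes. The choice of Latin-1 and UTF-8 is only an example:</p>
<pre>
import codecs
import io

data_info = codecs.lookup("latin-1")   # encoding seen by the caller
file_info = codecs.lookup("utf-8")     # encoding stored in the stream

raw = io.BytesIO()
recoder = codecs.StreamRecoder(raw,
                               data_info.encode, data_info.decode,
                               file_info.streamreader, file_info.streamwriter)

# write() decodes the Latin-1 input; the UTF-8 writer then re-encodes it
recoder.write("caf\xe9".encode("latin-1"))
stored = raw.getvalue()

# read() pulls UTF-8 data through the reader and encodes it as Latin-1
raw.seek(0)
returned = recoder.read()
</pre>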
<P><tt cLASS="monofont">codecs</tt> also defines the following byte-order marker constants that can be used to help interpret platform-specific files:</p>
<p><TABLe borDER="1" CellsPACIng="0" cellpadding="1" width="100%">
<coLgrOup sPan="2">
<tr>
<Th vaLIGN="top">
<font SIZE="2">
<p><b>Constant</b></p>
</FONT></th>
<th VALIgn="top">
<font size="2">
<p><b>Description</b></p>
</font></th>
</Tr>
<tR>
<td vAlign="top">
<Font SIZE="2">
<p><tt cLASS="monofont">BOM</tt></p>
</fONT></Td>
<td vALIGn="top">
<font size="2">
<p>Native byte-order marker for the machine</p>
</font></td>
</tr>
<tR>
<td ValiGn="top">
<fonT sizE="2">
<P><TT clasS="monofont">BOM_BE</TT></P>
</font></TD>
<TD valiGN="top">
<FOnt size="2">
<p>Big-endian byte-order marker</p>
</font></td>
</tr>
<tr>
<td ValIgn="top">
<fOnt siZe="2">
<p><tT CLAss="monofont">BOM_LE</tt></P>
</FONt></td>
<tD VALign="top">
<fONT Size="2">
<p>Little-endian byte-order marker</p>
</font></td>
</tr>
<tr>
<td valIgn="top">
<Font Size="2">
<p><Tt clASS="monofont">BOM32_BE</Tt></p>
</foNT></TD>
<td vaLIGN="top">
<font SIZE="2">
<p>32-bit big-endian marker</p>
</font></td>
</tr>
<tr>
<td valign="top">
<FonT sizE="2">
<p><tt cLass="monofont">BOM32_LE</TT></P>
</Font></tD>
<TD ValigN="top">
<FONt sizE="2">
<P>32-bit little-endian marker</P>
</Font></td>
</tr>
<tr>
<td valign="top">
<fonT siZe="2">
<p><tT clasS="monofont">BOM64_BE</tt></p>
</FONT></td>
<td VALIgn="top">
<foNT SIze="2">
<p>64-bit big-endian marker</p>
</FONT></td>
</tr>
<tr>
<td valign="top">
<font siZe="2">
<p><Tt clAss="monofont">BOM64_LE</tt></P>
</fonT></TD>
<Td valIGN="top">
<Font sIZE="2">
<P>64-bit little-endian marker</p>
</fonT></TD>
</Tr>
</colgroup>
</table></p>
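<p>One possible use of these constants is to guess the byte order of 16-bit data from its first two bytes. The helper name <tt class="monofont">detect_byte_order</tt> below is hypothetical, and the sketch assumes Python 3, where the constants are byte strings:</p>
<pre>
import codecs
import sys

def detect_byte_order(data):
    """Return 'big', 'little', or None based on a leading 16-bit BOM."""
    if data.startswith(codecs.BOM_BE):
        return "big"
    if data.startswith(codecs.BOM_LE):
        return "little"
    return None

sample = codecs.BOM_LE + "hi".encode("utf-16-le")
order = detect_byte_order(sample)

# codecs.BOM matches whichever marker is native to this machine
native = detect_byte_order(codecs.BOM)
</pre>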
<h5>Example</h5>
<p>The following example illustrates the implementation of a new encoding using simple exclusive-or (XOR) based encryption. This only works for 8-bit strings, but it could be extended to support Unicode:</p>
<pRe>
# xor.py: Simple encryption using XOR
import codecs

# Encoding/decoding function (works both ways)
def xor_encode(input, errors='strict', key=0xff):
    output = "".join([chr(ord(c) ^ key) for c in input])
    return (output, len(input))

# XOR Codec class
class Codec(codecs.Codec):
    key = 0xff
    def encode(self, input, errors='strict'):
        return xor_encode(input, errors, self.key)
    def decode(self, input, errors='strict'):
        return xor_encode(input, errors, self.key)

# StreamWriter and StreamReader classes
class StreamWriter(Codec, codecs.StreamWriter):
    pass

class StreamReader(Codec, codecs.StreamReader):
    pass

# Factory functions for creating StreamWriter and
# StreamReader objects with a given key value.
def xor_writer_factory(stream, errors, key=0xff):
    s = StreamWriter(stream, errors)
    s.key = key
    return s

def xor_reader_factory(stream, errors, key=0xff):
    r = StreamReader(stream, errors)
    r.key = key
    return r

# Function registered with the codecs module. Recognizes any
# encoding of the form 'xor-hh' where hh is a hexadecimal number.
def lookup(s):
    if s[:4] == 'xor-':
        key = int(s[4:], 16)
        # Create some functions with key set to desired value
        e = lambda x, err='strict', key=key: xor_encode(x, err, key)
        r = lambda x, err='strict', key=key: xor_reader_factory(x, err, key)
        w = lambda x, err='strict', key=key: xor_writer_factory(x, err, key)
        return (e, e, r, w)

# Register with the codec module
codecs.register(lookup) </pRe>
<p>Now, here's a short program that uses the encoding:</p>
<Pre>
import xor, codecs
f = codecs.open("foo","w","xor-37")
f.write("Hello World\n") # Writes an "encrypted" version
f.close()
(enc,dec,r,w) = codecs.lookup("xor-ae")
a = enc("Hello World")
# a = ('\346\313\302\302\301\216\371\301\334\302\312', 11) </prE>
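<p>The 4-tuple returned by <tt class="monofont">lookup()</tt> above reflects the codec interface this section describes. As a hedged sketch of how the same idea might look under Python 3, where search functions return a <tt class="monofont">codecs.CodecInfo</tt> object and the data are byte strings, consider the following. The names <tt class="monofont">xor_bytes</tt> and <tt class="monofont">xor_search</tt> are illustrative, and the prefix check accepts both <tt class="monofont">xor-</tt> and <tt class="monofont">xor_</tt> because newer interpreters normalize hyphens in encoding names to underscores:</p>
<pre>
import codecs

def xor_bytes(data, errors="strict", key=0xff):
    """XOR every byte with key; applying it twice restores the input."""
    return bytes(b ^ key for b in data), len(data)

def xor_search(name):
    """Recognize 'xor-hh' or 'xor_hh', where hh is a hexadecimal key."""
    name = name.replace("-", "_")
    if not name.startswith("xor_"):
        return None
    try:
        key = int(name[4:], 16)
    except ValueError:
        return None
    func = lambda data, errors="strict": xor_bytes(data, errors, key)
    return codecs.CodecInfo(encode=func, decode=func, name=name)

codecs.register(xor_search)

# Same key and text as the example above, expressed as bytes
a = codecs.encode(b"Hello World", "xor-ae")
b = codecs.decode(a, "xor-ae")
</pre>
<p>Because XOR is its own inverse, the same function serves as both encoder and decoder, just as <tt class="monofont">xor_encode</tt> does in the original module.</p>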
<h5>Notes</h5>
<uL>
<LI>
<P>Further use of the <tt clASS="monofont">codecs</Tt> module is described in <a hrEF="89.html">Chapter 9</A>.</P>
</li>
<li>
<P>Most of the built-in encodings are provided to support Unicode string encoding. In this case, the encoding functions produce 8-bit strings and the decoding functions produce Unicode strings.</P>
</LI>
</ul>
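<p>The second note can be checked directly. This sketch assumes Python 3, where 8-bit strings are <tt class="monofont">bytes</tt> and Unicode strings are <tt class="monofont">str</tt>. The UTF-8 encoder turns text into bytes and the decoder does the reverse; each call also returns the amount of input consumed:</p>
<pre>
import codecs

info = codecs.lookup("utf-8")

# Encoding: Unicode string in, 8-bit string out
encoded, chars_consumed = info.encode("caf\xe9")

# Decoding: 8-bit string in, Unicode string out
decoded, bytes_consumed = info.decode(encoded)
</pre>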
<p><b>See Also</b> <a href="89.html">Chapter 9</a>.</p>
<a name="8"></a>
<h4><tT clAss="monofont">re</tT></h4>
<p>The <tt ClasS="monofont">re</TT> module is used to perform regular-expression pattern matching and replacement in strings. Both ordinary and Unicode strings are supported. Regular-expression patterns are specified as strings containing a mix of text and special-character sequences. Since patterns often make extensive use of special characters and the backslash, they're usually written as "raw