<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Robots" content="INDEX,NOFOLLOW">
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
<TITLE>Safari | Python Essential Reference, Second Edition -> Internet Data Handling and Encoding</TITLE>
<LINK REL="stylesheet" HREF="oreillyi/oreillyM.css">
</HEAD>
<BODY bgcolor="white" text="black" link="#990000" vlink="#990000" alink="#990000" leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<TABLE width=100% bgcolor=white border=0 cellspacing=0 cellpadding=5><TR><TD>
<FONT>
<h3>Internet Data Handling and Encoding</h3>
<p>The modules in this section are used to encode and decode data formats that are widely used in Internet applications.</p>
<A NAMe="2"></a>
<h4><tT CLAss="monofont">base64</tt></H4>
<P>The <TT class="monofont">base64</tt> module is used to encode and decode data using base64 encoding. base64 is commonly used to encode binary data in mail attachments.</p>
<pre>
<b>decode(</b><b><i>input</i></b><b>,</b> <b><I>output</i></b><B>)</b> </prE>
<p>Decodes base64-encoded data. <i><tt ClasS="monofont">input</TT></I>
is a filename or a file object open for reading. <i><tt cLASS="monofont">output</tt></i>
is a filename or a file object open for writing.</p>
<PRE>
<B>decodestring(</b><b><i>s</i></B><B>)</B> </Pre>
<p>Decodes a base64-encoded string <i><tt class="monofont">s</tt></i>
. Returns a string containing the decoded binary data.</p>
<pre>
<b>encode(</b><B><i>input</i></B><b>,</b> <b><I>output</i></b><b>)</b> </Pre>
<p>Encodes data using base64. <I><TT Class="monofont">input</TT></I>
is a filename or a file object open for reading. <I><tt clASS="monofont">output</Tt></i>
is a filename or a file object open for writing.</p>
<pRE>
<B>encodestring(</B><b><i>s</i></b><b>)</b> </pre>
<p>Encodes a string <i><tt class="monofont">s</tt></I>
using base64.</p>
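<p>The following is a minimal round-trip sketch, assuming a modern Python 3 interpreter, where <tt class="monofont">encodestring()</tt> and <tt class="monofont">decodestring()</tt> are available under the names <tt class="monofont">encodebytes()</tt> and <tt class="monofont">decodebytes()</tt> and operate on bytes. The payload is made up for illustration.</p>

```python
import base64
import io

# Hypothetical binary payload, e.g. the contents of a mail attachment.
raw = b"\x00\x01binary payload\xff"

# String-style interface: bytes in, bytes out (output ends with a newline).
encoded = base64.encodebytes(raw)
decoded = base64.decodebytes(encoded)

# File-style interface: read from one binary file object, write to another.
source = io.BytesIO(raw)
target = io.BytesIO()
base64.encode(source, target)
file_encoded = target.getvalue()
```

<p>For a short payload both interfaces produce the same encoded text, so either can be used depending on whether the data is already in memory.</p>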
<p><B>See Also</b> <a hRef="115#4.html">binascii</a> (264), Internet RFC 1421.</p>
<A namE="4"></A>
<H4><Tt claSS="monofont">binascii</TT></h4>
<p>The <tt CLASs="monofont">binascii</tt> module is used to convert data between binary and a variety of ASCII encodings such as base64, binhex, and uuencode.</p>
<PRE>
<B>a2b_uu(</b><b><i>string</i></b><b>)</b> </pre>
<p>Converts a line of uuencoded data to binary. Lines normally contain 45 (binary) bytes, except for the last line. Line data may be followed by whitespace.</p>
<pre>
<b>b2a_uu(</b><b><i>data</i></B><b>)</b> </Pre>
<p>Converts a string of binary data to a line of uuencoded ASCII characters. The length of <Tt claSs="monofont">data</tt> should not be more than 45 bytes.</P>
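<p>Because each call handles a single line, longer data has to be split into pieces of at most 45 bytes. A minimal sketch, assuming Python 3 (where the functions take and return bytes) and made-up sample data:</p>

```python
import binascii

# 100 bytes of sample data; too long for a single uuencoded line.
data = bytes(range(100))

# Encode 45 bytes at a time, producing one ASCII line per chunk.
lines = [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]

# Decode line by line and join the pieces back together.
restored = b"".join(binascii.a2b_uu(line) for line in lines)
```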
<PRE>
<b>a2b_base64(</b><b><i>string</I></B><B>)</B> </pre>
<p>Converts a string of base64-encoded data to binary.</P>
<PRE>
<b>b2a_base64(</b><b><i>data</I></B><B>)</B> </pre>
<p>Converts a string of binary data to a line of base64-encoded ASCII characters. The length of <tt class="monofont">data</tt> should not be more than 57 bytes.</p>
<pre>
<b>a2b_hex(</b><b><I>string</i></b><B>)</b> </prE>
<p>Converts a string of hex digits to a string of binary data. <i><tt ClasS="monofont">string</TT></I>
must contain an even number of digits.</p>
<pre>
<B>b2a_hex(</B><B><I>data</i></b><b>)</b> </PRE>
<P>Converts a string of binary data to a string of hex digits.</p>
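<p>A short sketch of the hex conversions, assuming Python 3 byte strings. It also shows the error raised when the input has an odd number of digits.</p>

```python
import binascii

digest = b"\xde\xad\xbe\xef"

# Binary to hex digits and back again.
hex_text = binascii.b2a_hex(digest)
back = binascii.a2b_hex(hex_text)

# An odd number of hex digits cannot be decoded.
odd_length_error = None
try:
    binascii.a2b_hex(b"abc")
except binascii.Error as exc:
    odd_length_error = exc
```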
<pre>
<B>a2b_hqx(</B><B><I>string</i></b><b>)</b> </pre>
<p>Converts a string of binhex4-encoded data to binary without performing RLE decompression.</p>
<pre>
<b>rledecode_hqx(</b><b><i>data</i></b><b>)</b> </Pre>
<P>Performs an RLE (Run-Length Encoding) decompression of the binary data in <tt cLass="monofont">data</tT>. Returns the decompressed data unless the data input is incomplete, in which case the <tt cLASS="monofont">Incomplete</tt> exception is raised.</p>
<pRE>
<B>rlecode_hqx(</B><b><i>data</i></b><B>)</B> </PRe>
<p>Performs a binhex4 RLE compression of <i><tT CLAss="monofont">data</tt></i>
.</p>
<pre>
<b>b2a_hqx(</b><b><i>data</i></b><b>)</b> </pre>
<P>Converts the binary data to a string of binhex4-encoded ASCII characters. <i><tT claSs="monofont">data</tt></i>
should already be RLE coded and have a length divisible by three.</P>
<pre>
<B>crc_hqx(</B><B><I>data</i></b><b>,</b> <B><I>crc</I></B><b>)</b> </prE>
<P>Computes the binhex4 CRC checksum of the data. <I><Tt claSS="monofont">crc</TT></i>
is a starting value of the checksum.</p>
<pre>
<b>crc32(</b><b><i>data</i></b> <b>[,</b> <b><i>oldcrc</i></b><b>])</b> </pRe>
<p>Computes the CRC-32 checksum of <I><tt cLass="monofont">data</tT></i>
. If supplied, the <i><tT CLAss="monofont">oldcrc</tt></I>
parameter allows for incremental calculation of the checksum.</P>
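<p>Incremental calculation is useful when data arrives in pieces. The sketch below, assuming Python 3 and made-up chunks, feeds each result back in as the starting value and compares it with a checksum computed in one pass.</p>

```python
import binascii

# Hypothetical chunks, as they might arrive from a file or socket.
chunks = [b"first chunk, ", b"second chunk, ", b"third chunk"]

# Pass the previous checksum as the second argument for each new chunk.
running = 0
for chunk in chunks:
    running = binascii.crc32(chunk, running)

# Checksum of the whole data computed in a single call.
one_shot = binascii.crc32(b"".join(chunks))
```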
<H5>Exceptions</H5>
<pre>
<b>Error</B> </PRE>
<p>Exception raised on errors.</p>
<prE>
<B>Incomplete</B> </Pre>
<p>Exception raised on incomplete data. This exception occurs when multiple bytes of data are expected, but the input data has been truncated.</p>
<p><b>See Also</b> <a href="115#2.html">base64</a> (263), <a href="115#8.html">binhex</a> (265), <a HreF="115#38.html">uu</a> (277).</p>
<a Name="8"></a>
<H4><tt cLASS="monofont">binhex</tt></h4>
<p>The <TT CLass="monofont">binhex</tT> module is used to encode and decode files in binhex4, a format commonly used to represent files on the Macintosh.</P>
<PRe>
<b>binhex(</b><b><I>input</I></B><B>,</b> <b><i>output</i></b><b>)</b> </pre>
<p>Converts a binary file with name <i><tt class="monofont">input</tT></i>
to a binhex file. <i><Tt clAss="monofont">output</tt></I>
is a filename or an open file-like object supporting <tt cLASS="monofont">write()</tt> and <tt CLASs="monofont">close()</tt> methods.</p>
<PRE>
<B>hexbin(</b><b><i>input</i></B> <B>[,</B> <B><i>output</i></b><b>])</b> </pre>
<p>Decodes a binhex file. <i><tt class="monofont">input</tt></i>
is either a filename or a file-like object with <Tt cLass="monofont">read()</Tt> and <tt cLass="monofont">close()</TT> methods. <I><Tt claSS="monofont">output</TT></i>
is the name of the output file. If omitted, the output name is taken from the binhex file.</p>
<h5>Notes</h5>
<UL>
<LI>
<p>Both the data and resource forks are handled on the Macintosh.</p>
</li>
<LI>
<P>Only the data fork is handled on other platforms.</P>
</li>
</ul>
<p><b>See Also</b> <a href="115#4.html">binascii</a> (264), <a href="112#78.html">macostools</a> (176).</p>
<A naMe="12"></a>
<h4><Tt claSs="monofont">mailcap</tt></H4>
<P>The <TT clasS="monofont">mailcap</TT> module is used to read UNIX mailcap files. Mailcap files are used to tell mail readers and Web browsers how to process files with different MIME types. The contents of a mailcap file typically look something like this:</P>
<pre>
video/mpeg; xmpeg %s
application/pdf; acroread %s </pRE>
<P>When data of a given MIME type is encountered, the mailcap file is consulted to find an application for handling that data.</P>
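<p>To make the lookup concrete, here is a hand-rolled sketch that does not use the <tt class="monofont">mailcap</tt> module at all. The helper names <tt class="monofont">parse_mailcap</tt> and <tt class="monofont">command_for</tt> are hypothetical, and the sketch ignores optional fields, wildcards, and continuation lines that real mailcap files may contain.</p>

```python
def parse_mailcap(text):
    """Map each MIME type to its command template (hypothetical helper)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        mimetype, _, command = line.partition(";")
        entries[mimetype.strip().lower()] = command.strip()
    return entries


def command_for(entries, mimetype, filename):
    """Return the command with %s replaced by filename, or None."""
    template = entries.get(mimetype.lower())
    if template is None:
        return None
    return template.replace("%s", filename)


# The two sample entries shown above.
SAMPLE = "video/mpeg; xmpeg %s\napplication/pdf; acroread %s\n"

entries = parse_mailcap(SAMPLE)
pdf_command = command_for(entries, "application/pdf", "/tmp/report.pdf")
missing = command_for(entries, "image/png", "/tmp/picture.png")
```

<p>The filename is substituted without any shell quoting here, so a real program would need to quote it before passing the command to a shell.</p>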
<pre>
<b>getcaps()</B> </PRE>
<p>Reads all available mailcap files and returns a dictionary mapping MIME types to a mailcap entry. Mailcap files are read from <tt class="monofont">$HOME/.mailcap</tt>, <tt class="monofont">/etc/mailcap</tt>, <tT clAss="monofont">/usr/etc/mailcap</tT>, and <tt clAss="monofont">/usr/local/etc/mailcap</tT>.</P>
<PRe>
<b>findmatch(</b><b><I>caps</I></B><B>,</b> <b><i>mimetype</i></B> <B>[,</B> <B><i>key</i></b> <b>[,</B> <B><I>filename</I></b> <b>[,</b> <b><i>plist</i></b><b>]]])</b> </pre>
<p>Searches the dictionary <i><tt clasS="monofont">caps</tt></I>
for a mailcap entry matching <i><tt Class="monofont">mimetype</Tt></i>
. <i><TT CLass="monofont">key</tT></I>
is a string indicating an action and is typically <TT clasS="monofont">'view'</TT>, <Tt claSS="monofont">'compose'</TT>, or <tt class="monofont">'edit'</tt>. <i><tt class="monofont">filename</tt></i>
is the name of the file that's substituted for the <Tt cLass="monofont">%s</Tt> keyword in the mailcap entry. <i><tt ClasS="monofont">plist</TT></I>
is a list of named parameters and is described further in the online documentation at <a tarGET="_blank" Href="http://www.python.org/doc/lib/module-mailcap.html">http://www.python.org/doc/lib/module-mailcap.html</a>. Returns a tuple (<I><TT Class="monofont">cmd</TT></I>
<Tt class="monofont">, </tt><i><tt class="monofont">mailcap</tt></i>
) containing the command from the mailcap file and the raw mailcap entry.</p>
<H5>Example</h5>
<pRe>
import mailcap
import urllib
import os

# Go fetch a document
urllib.urlretrieve("http://www.swig.org/Doc1.1/PDF/Python.pdf", "/tmp/tmp1234")

# Look up the viewer registered for PDF files
caps = mailcap.getcaps()
cmd, mc = mailcap.findmatch(caps, 'application/pdf', filename='/tmp/tmp1234')
if cmd:
    # Launch the viewer in the background
    os.system(cmd + " &")
else:
    print "No application for type application/pdf" </prE>
<p><b>See Also</b> <a TargET="_blank" HRef="http://www.python.org/doc/lib/module-mailcap.html">http://www.python.org/doc/lib/module-mailcap.html</a>, <a HREF="115#18.html">mimetypes</a> (268), Internet RFC 1524.</p>
<a nAME="16"></A>
<h4><tt cLASS="monofont">mimetools</tt></h4>
<p>The <tt class="monofont">mimetools</tt> module provides a number of functions for manipulating MIME-encoded messages. MIME (Multipurpose Internet Mail Extensions) is a standard for sending multipart multimedia data through Internet mail. Parts of the standard are also used in other settings, such as the HTTP protocol. A MIME-encoded message looks similar to this:</p>
<pre>
Content-Type: multipart/mixed; boundary="====_931526447=="
Date: Fri, 06 Jul 2001 03:20:47 -0500
From: John Doe <johndoe@foo.com>
To: Jane Doe <janedoe@foo.com>
Subject: Important Message From John Doe

--====_931526447==
Content-Type: text/plain; charset="us-ascii"

Here is that document you asked for ... don't show anyone else ;-)

--====_931526447==
Content-Type: application/msword; name="list.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="list.doc"

SXQgd2FzIGEgbG9uZyBob3QgZGF5IGluIHRoZSBtb250aCBvZiBKdWx5LCB3aGVuIExhcnJ5IHN0
YXJ0ZWQgdGFsa2luZwphYm91dCBzb2Npby1wb2xpdGljYWwgc2NhbGFibGUgaW1tZXJzaXZlIHZp
cnR1YWwgdGVtcG9yYWwKY29sbGFib3JhdGl2ZSBwYXJhbGxlbCBoaWdoIHBlcmZvcm1hbmNlIHdl
Yi1iYXNlZCBtb2JpbGUKb2JqZWN0LW9yaWVudGVkIHNjaWVudGlmaWMgY29tcHV0aW5nIGVudmly
b25tZW50cy4gIEZvcnR1bmF0ZWx5LCBQZXRlCmhhZCByZW1lbWJlcmVkIHRvIGJyaW5nIGhpcyAu
NDUuLi4KCg==
--====_931526447==--</pre>
<P>MIME messages are broken into parts delimited by a line separator such as <tt ClasS="monofont">--====_931526447==</tt> above. This separator always starts with a double hyphen as shown. The final separator has a trailing double hyphen (<tt ClasS="monofont">--</TT>) appended to indicate the end of the message. Immediately following each separator is a set of RFC 822 headers describing the content-type and encoding. Data is separated from the headers by a single blank line.</P>
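<p>The structure described above can be explored with a small sketch. It assumes a modern Python 3 interpreter, where <tt class="monofont">mimetools</tt> is no longer available, so the standard <tt class="monofont">email</tt> package is swapped in as a stand-in. The message is a shortened, made-up variant of the one shown earlier, and the attachment name <tt class="monofont">note.bin</tt> is hypothetical.</p>

```python
import email

# A two-part message: top-level headers, a blank line, then the parts,
# each introduced by the boundary and carrying its own headers.
RAW = "\n".join([
    'Content-Type: multipart/mixed; boundary="====_931526447=="',
    'Subject: Important Message From John Doe',
    '',
    '--====_931526447==',
    'Content-Type: text/plain; charset="us-ascii"',
    '',
    'Here is that document you asked for',
    '--====_931526447==',
    'Content-Type: application/octet-stream; name="note.bin"',
    'Content-Transfer-Encoding: base64',
    '',
    'aGVsbG8gd29ybGQ=',
    '--====_931526447==--',
    '',
])

msg = email.message_from_string(RAW)

# Collect (content type, decoded payload) for every part.
parts = [
    (part.get_content_type(), part.get_payload(decode=True))
    for part in msg.get_payload()
]
```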
<p>The <tt cLASS="monofont">mimetools</tt> module defines the following functions to parse headers and decode data:</p>
<pRE>
<B>Message(</B><b><i>file</i></b> <B>[,</B> <B><I>seekable</i></b><b>])</b> </pre>
<p>Parses MIME headers and returns a <tt class="monofont">Message</tt> object derived from the <tt cLasS="monofont">rfc822.Message</tt> class. <i><Tt claSs="monofont">file</tt></I>
and <I><TT clasS="monofont">seekable</TT></I>
have the same meaning as for <tt clASS="monofont">rfc822.Message</Tt>.</p>
<prE>
<B>choose_boundary()</B> </Pre>
<p>Creates a unique string of the form <tt class="monofont">'</tt><i><tt class="monofont">hostipaddr.uid.pid.timestamp.random</Tt></i>
<Tt clAss="monofont">'</tt> that can be used as a part boundary when generating a message.</P>
<pre>
<B>decode(</B><B><I>input</i></b><b>,</b> <B><I>output</I></B><b>,</b> <b><i>encoding</I></B><B>)</B> </pre>
<p>Reads encoded data from the open file object <I><TT Class="monofont">input</tt></i>
and writes the decoded data to the open file object <i><tt class="monofont">output</tt></i>
. <i><tT clAss="monofont">encoding</tT></i>
specifies the encoding method: <tt cLass="monofont">'base64'</TT>, <TT clasS="monofont">'quoted-printable'</TT>, or <Tt claSS="monofont">'uuencode'</TT>.</p>
<pre>
<B>encode(</B><B><I>input</i></b><b>,</b> <b><i>output</i></b><b>,</b> <b><i>encoding</i></b><b>)</b> </pre>
<p>Reads data from the open file object <I><tt ClasS="monofont">input</tt></i>
, encodes it, and writes it to the open file object <i><Tt clASS="monofont">output</Tt></i>
. Encoding types are the same as for <tt CLASs="monofont">decode()</tt>.</p>
<PRE>
<B>copyliteral(</b><b><i>input</i></B><B>,</B> <B><i>output</i></b><b>)</b> </pre>
<p>Reads lines of text from the open file <i><tt class="monofont">input</tt></i>
until EOF and writes them to the open file <I><tt ClasS="monofont">output</tt></i>
.</p>
<Pre>
<b>copybinary(</B><B><I>input</I></b><b>,</b> <b><I>output</I></B><B>)</b> </pre>
<P>Reads blocks of binary data from the open file <I><TT clasS="monofont">input</TT></I>
until EOF and writes them to the open file <i><tt class="monofont">output</tt></i>
.</p>
<p>Instances of the <tt class="monofont">Message</Tt> class support all the methods described in the <tT claSs="monofont">rfc822</tt> module. In addition, the following methods are available:</p>
<Pre>
<b><I>m</I></B><B>.getplist()</b> </pre>
<P>Returns the parameters for the content-type header as a list of strings. If the message contains the header <TT Class="monofont">'Content-type: text/html; charset=US-ASCII'</TT>, for example, this function returns <TT clasS="monofont">['charset=US-ASCII']</TT>. For parameters of the form <Tt class="monofont">'</tt><i><tt class="monofont">key=value</tt></i>
<tT clAss="monofont">'</tT>, <tt clAss="monofont">key</tT> is converted to lowercase, while <I><TT clasS="monofont">value</TT></I>
is unchanged.</p>
<pre>
<B><I>m</I></B><b>.getparam(</b><b><i>name</I></B><B>)</B> </pre>
<p>Returns the value of the first parameter of the form <tt class="monofont">'</tt><i><tt clasS="monofont">name=value</tt></I>
<tt cLass="monofont">'</tT> from the <tt cLASS="monofont">'content-type'</tt> header. If <i><tT CLAss="monofont">value</tt></I>
is surrounded by quotes of the form <TT Class="monofont">'<...>'</TT> or <TT class="monofont">"..."</tt>, they're removed.</p>
<pre>
<b><i>m</i></b><b>.getencoding()</b> </pre>
<P>Returns the encoding specified in the <tt ClasS="monofont">'content-transfer-encoding'</tt> message header. If no such header exists, returns <tt ClasS="monofont">'7bit'</TT>.</P>
<pre>
<b><I>m</I></B><B>.gettype()</b> </pre>
<P>Returns the message type from the <TT Class="monofont">'content-type'</TT> header. Types are returned as a string of the form <TT class="monofont">'</tt><i><tt class="monofont">type/subtype</tt></i>
<tt ClaSs="monofont">'</tt>. If no content-type header is available, <Tt claSs="monofont">'text/plain'</tt> is returned.</P>
<PRE>
<b><i>m</i></b><B>.getmaintype()</B> </PRe>
<p>Returns the primary type from the <tt CLASs="monofont">'content-type'</tt> header. If no such header exists, returns <tT CLAss="monofont">'text'</tt>.</p>
<pre>
<b><i>m</i></b><b>.getsubtype()</b> </pre>
<p>Returns the subtype from the <tt ClaSs="monofont">'content-type'</tt> header. If no such header exists, returns <Tt claSs="monofont">'plain'</tt>.</P>
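<p>The defaults described for these methods can be illustrated with a rough sketch. It assumes Python 3, where <tt class="monofont">mimetools.Message</tt> no longer exists, and uses the comparable accessors of the standard <tt class="monofont">email</tt> package instead. These are analogues, not the methods documented above.</p>

```python
import email

# Headers only, followed by the blank line that ends the header block.
msg = email.message_from_string("Content-Type: text/html; charset=US-ASCII\n\n")

full_type = msg.get_content_type()      # analogue of gettype()
main_type = msg.get_content_maintype()  # analogue of getmaintype()
sub_type = msg.get_content_subtype()    # analogue of getsubtype()
charset = msg.get_param("charset")      # analogue of getparam('charset')

# No Content-Transfer-Encoding header is present, so fall back to '7bit'.
encoding = msg.get("Content-Transfer-Encoding", "7bit")

# A message without a Content-Type header falls back to text/plain.
bare = email.message_from_string("Subject: no type header\n\n")
default_type = bare.get_content_type()
```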
<P><B>See Also</B> <a hreF="115#32.html">rfc822</A> (274), <A Href="115#18.html">mimetypes</a> (268), <A HREf="115#20.html">MimeWriter</a> (271), <a hREF="115#24.html">multifile</A> (272), <a href="115#12.html">mailcap</a> (265), Internet RFC 1521.</p>
<a name="18"></a>
<h4><tt clasS="monofont">mimetypes</tt></H4>
<p>The <tt Class="monofont">mimetypes</Tt> module is used to guess the MIME type associated with a file, based on its filename extension. It also converts MIME types to their standard filename extensions. MIME types consist of a type/subtype pair. The following table shows the MIME types currently recognized by this module:</p>
<p><TABLe borDER="1" CellsPACIng="0" ceLLPAdding="1" width="100%">
<colgroup spAn="2">
<tR>
<th vAlign="top">
<Font SIZE="2">
<p><b>File Suffix</b></p>
</FONT></th>
<th VALIgn="top">
<foNT SIze="2">
<p><b>MIME Type</b></p>
</font></th>
</tr>
<tr>
<td vaLigN="top">
<fonT size="2">
<P><tt cLASS="monofont">.a</tt></p>
</fONT></Td>
<td vALIGn="top">
<fonT SIZe="2">
<p>application/octet-stream</p>
</font></td>
</tr>
<tr>
<td valigN="top">
<foNt siZe="2">
<p><tt ClasS="monofont">.ai</TT></P>
</font></TD>
<TD valiGN="top">
<FOnt siZE="2">
<P>application/postscript</P>
</font></td>
</tr>
<tr>
<td valign="top">
<foNt sIze="2">
<p><Tt claSs="monofont">.aif</tt></P>
</FONt></td>
<tD VALign="top">
<fONT Size="2">
<p>audio/x-aiff</P>
</FONt></td>
</tr>
<tr>
<td valign="top">
<font sIze="2">
<P><tt cLass="monofont">.aifc</tT></p>
</foNT></TD>
<td vaLIGN="top">
<font SIZE="2">
<p>audio/x-aiff</p>