module docstring from the parse tree created previously is easy:

<P>
<dl><dd><pre class="verbatim">
&gt;&gt;&gt; found, vars = match(DOCSTRING_STMT_PATTERN, tup[1])
&gt;&gt;&gt; found
1
&gt;&gt;&gt; vars
{'docstring': '"""Some documentation.\012"""'}
</pre></dl>
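
<P>
To make the interactive session above easier to follow, here is a minimal
sketch of a tuple matcher in the spirit of <tt class="function">match()</tt>.  It is
an illustration under stated assumptions, not the code from
<span class="file">example.py</span>: <code>PATTERN</code> and <code>TREE</code> are hypothetical,
heavily shortened stand-ins, and string labels replace the numeric node
types of the real parse tree.

```python
def match(pattern, data, vars=None):
    """Compare the nested tuple `data` against `pattern`.

    A one-element list in the pattern, such as ['docstring'], acts as a
    variable: it matches anything and records the matched value.
    """
    if vars is None:
        vars = {}
    if isinstance(pattern, list):
        vars[pattern[0]] = data
        return True, vars
    if not isinstance(pattern, tuple):
        # Leaf of the pattern: plain comparison.
        return pattern == data, vars
    if not isinstance(data, tuple) or len(data) != len(pattern):
        return False, vars
    for sub_pattern, sub_data in zip(pattern, data):
        same, vars = match(sub_pattern, sub_data, vars)
        if not same:
            return False, vars
    return True, vars


# Hypothetical, shortened stand-ins for the real pattern and parse tree.
PATTERN = ("stmt", ("simple_stmt", ("STRING", ["docstring"]), ("NEWLINE", "")))
TREE = ("stmt",
        ("simple_stmt",
         ("STRING", '"""Some documentation.\n"""'),
         ("NEWLINE", "")))

found, variables = match(PATTERN, TREE)
print(found, variables)
```

<P>
The captured value is the string token exactly as it appears in the
source, quotes included, which is why a later step has to convert it to
the string it denotes.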

<P>
Once specific data can be extracted from a location where it is
expected, the next question is where that information can be expected
to appear.  For docstrings, the answer is
fairly simple: the docstring is the first <tt class="constant">stmt</tt> node in a code
block (<tt class="constant">file_input</tt> or <tt class="constant">suite</tt> node types).  A module
consists of a single <tt class="constant">file_input</tt> node, and class and function
definitions each contain exactly one <tt class="constant">suite</tt> node.  Classes and
functions are readily identified as subtrees of code block nodes which
start with <code>(stmt, (compound_stmt, (classdef, ...</code> or
<code>(stmt, (compound_stmt, (funcdef, ...</code>.  Note that these subtrees
cannot be matched by <tt class="function">match()</tt>, since it has no way to match
an arbitrary number of sibling nodes.  A more elaborate
matching function could overcome this limitation, but the simple
version is sufficient for the example.
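
<P>
One way such a more elaborate matcher might look is sketched below.  It
is only an illustration: the <code>ANY_TAIL</code> marker, the
<code>match_tail()</code> name, and the string node labels are all hypothetical
and do not appear in <span class="file">example.py</span>.  A pattern tuple that ends
with <code>ANY_TAIL</code> accepts any number of further sibling nodes.

```python
ANY_TAIL = object()  # hypothetical marker: "any number of remaining siblings"


def match_tail(pattern, data, vars=None):
    """Tuple matcher in which a pattern tuple may end with ANY_TAIL."""
    if vars is None:
        vars = {}
    if isinstance(pattern, list):        # ['name'] captures the node
        vars[pattern[0]] = data
        return True, vars
    if not isinstance(pattern, tuple):   # leaf: plain comparison
        return pattern == data, vars
    if not isinstance(data, tuple):
        return False, vars
    if pattern and pattern[-1] is ANY_TAIL:
        fixed = pattern[:-1]
        data = data[:len(fixed)]         # ignore any extra trailing siblings
    else:
        fixed = pattern
    if len(data) != len(fixed):
        return False, vars
    for sub_pattern, sub_data in zip(fixed, data):
        same, vars = match_tail(sub_pattern, sub_data, vars)
        if not same:
            return False, vars
    return True, vars


# Hypothetical simplified subtree for a class definition.
CLASSDEF = ("stmt", ("compound_stmt", ("classdef", ANY_TAIL)))
FUNCDEF = ("stmt", ("compound_stmt", ("funcdef", ANY_TAIL)))
tree = ("stmt",
        ("compound_stmt",
         ("classdef", ("NAME", "class"), ("NAME", "Spam"),
          ("COLON", ":"), ("suite",))))

is_class, _ = match_tail(CLASSDEF, tree)
is_func, _ = match_tail(FUNCDEF, tree)
print(is_class, is_func)
```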

<P>
Given the ability to determine whether a statement might be a
docstring and to extract the actual string from the statement, the
remaining work is to walk the parse tree for an entire module,
extract information about the names defined in each context of the
module, and associate any docstrings with those names.  The code to
perform this work is not complicated, but bears some explanation.

<P>
The public interface to the classes is straightforward, though it should
probably be somewhat more flexible.  Each ``major'' block of the
module is described by an object providing several methods for inquiry
and a constructor which accepts at least the subtree of the complete
parse tree which it represents.  The <tt class="class">ModuleInfo</tt> constructor
accepts an optional <var>name</var> parameter since it cannot
otherwise determine the name of the module.

<P>
The public classes include <tt class="class">ClassInfo</tt>, <tt class="class">FunctionInfo</tt>,
and <tt class="class">ModuleInfo</tt>.  All objects provide the
methods <tt class="method">get_name()</tt>, <tt class="method">get_docstring()</tt>,
<tt class="method">get_class_names()</tt>, and <tt class="method">get_class_info()</tt>.  The
<tt class="class">ClassInfo</tt> objects support <tt class="method">get_method_names()</tt> and
<tt class="method">get_method_info()</tt> while the other classes provide
<tt class="method">get_function_names()</tt> and <tt class="method">get_function_info()</tt>.

<P>
Within each of the forms of code block that the public classes
represent, most of the required information is in the same form and is
accessed in the same way.  Classes differ only in that
functions defined at the top level of a class are referred to as ``methods.''
Since the difference in nomenclature reflects a real semantic
distinction from functions defined outside of a class, the
implementation needs to maintain the distinction.
Hence, most of the functionality of the public classes can be
implemented in a common base class, <tt class="class">SuiteInfoBase</tt>, with the
accessors for function and method information provided elsewhere.
Note that there is only one class which represents function and method
information; this parallels the use of the <tt class="keyword">def</tt> statement to
define both types of elements.
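
<P>
The arrangement described above might be laid out roughly as follows.
This is a sketch under assumptions, not the layout of
<span class="file">example.py</span>: the mixin names <code>FunctionAccessors</code> and
<code>MethodAccessors</code> are hypothetical, and the constructor takes a name
and docstring directly instead of a parse tree so that the block stands
on its own.

```python
class SuiteInfoBase:
    """Accessors shared by every kind of code block."""

    def __init__(self, name='', docstring=''):
        # Hypothetical simplification: the real constructor takes a parse tree.
        self._name = name
        self._docstring = docstring
        self._class_info = {}
        self._function_info = {}

    def get_name(self):
        return self._name

    def get_docstring(self):
        return self._docstring

    def get_class_names(self):
        return list(self._class_info)

    def get_class_info(self, name):
        return self._class_info[name]


class FunctionAccessors:
    """Mixin for modules and functions: nested defs are functions."""

    def get_function_names(self):
        return list(self._function_info)

    def get_function_info(self, name):
        return self._function_info[name]


class MethodAccessors:
    """Mixin for classes: the same stored data, exposed as methods."""

    def get_method_names(self):
        return list(self._function_info)

    def get_method_info(self, name):
        return self._function_info[name]


class FunctionInfo(SuiteInfoBase, FunctionAccessors):
    """Represents both functions and methods, as 'def' defines both."""


class ModuleInfo(SuiteInfoBase, FunctionAccessors):
    pass


class ClassInfo(SuiteInfoBase, MethodAccessors):
    pass


spam = ClassInfo("Spam", "A class.")
spam._function_info["eggs"] = FunctionInfo("eggs", "A method.")
print(spam.get_method_names())
print(spam.get_method_info("eggs").get_docstring())
```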

<P>
Most of the accessor functions are declared in <tt class="class">SuiteInfoBase</tt>
and do not need to be overridden by subclasses.  More importantly, the
extraction of most information from a parse tree is handled through a
method called by the <tt class="class">SuiteInfoBase</tt> constructor.  The example
code for most of the classes is clear when read alongside the formal
grammar, but the method which recursively creates new information
objects requires further examination.  Here is the relevant part of
the <tt class="class">SuiteInfoBase</tt> definition from <span class="file">example.py</span>:

<P>
<dl><dd><pre class="verbatim">
class SuiteInfoBase:
    _docstring = ''
    _name = ''

    def __init__(self, tree = None):
        self._class_info = {}
        self._function_info = {}
        if tree:
            self._extract_info(tree)

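    def _extract_info(self, tree):
        # extract docstring
        if len(tree) == 2:
            # short form: (suite, simple_stmt); only the simple_stmt
            # part of the pattern applies
            found, vars = match(DOCSTRING_STMT_PATTERN[1], tree[1])
        else:
            # long form: (suite, NEWLINE, INDENT, stmt, ..., DEDENT);
            # tree[3] is the first stmt node
            found, vars = match(DOCSTRING_STMT_PATTERN, tree[3])
        if found:
            # the matched token is source text; eval() yields its value
            self._docstring = eval(vars['docstring'])
        # discover inner definitions
        for node in tree[1:]:
            found, vars = match(COMPOUND_STMT_PATTERN, node)
            if found:
                cstmt = vars['compound']
                # cstmt[1] is the 'def' or 'class' keyword token and
                # cstmt[2] is the NAME token of the definition
                if cstmt[0] == symbol.funcdef:
                    name = cstmt[2][1]
                    self._function_info[name] = FunctionInfo(cstmt)
                elif cstmt[0] == symbol.classdef:
                    name = cstmt[2][1]
                    self._class_info[name] = ClassInfo(cstmt)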
</pre></dl>

<P>
After initializing some internal state, the constructor calls the
<tt class="method">_extract_info()</tt> method.  This method performs the bulk of the
information extraction which takes place in the entire example.  The
extraction has two distinct phases: the location of the docstring for
the parse tree passed in, and the discovery of additional definitions
within the code block represented by the parse tree.

<P>
The initial <tt class="keyword">if</tt> test determines whether the nested suite is of
the ``short form'' or the ``long form.''  The short form is used when
the code block is on the same line as the definition of the code
block, as in

<P>
<dl><dd><pre class="verbatim">
def square(x): "Square an argument."; return x ** 2
</pre></dl>

<P>
while the long form uses an indented block and allows nested
definitions:

<P>
<dl><dd><pre class="verbatim">
def make_power(exp):
    "Make a function that raises an argument to the exponent `exp'."
    def raiser(x, y=exp):
        return x ** y
    return raiser
</pre></dl>

<P>
When the short form is used, the code block may contain a docstring as
the first, and possibly only, <tt class="constant">small_stmt</tt> element.  The
extraction of such a docstring is slightly different and requires only
a portion of the complete pattern used in the more common case.  As
implemented, the docstring will be found only if it is the sole
<tt class="constant">small_stmt</tt> node in the <tt class="constant">simple_stmt</tt> node, so the
docstring of the <code>square()</code> example above, which is followed by a
second statement, would be missed.
Since most functions and methods which use the short form do not
provide a docstring, this may be considered sufficient.  The
extraction of the docstring proceeds using the <tt class="function">match()</tt> function
as described above, and the value of the docstring is stored as an
attribute of the <tt class="class">SuiteInfoBase</tt> object.
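
<P>
The example code converts the matched string token to its value with
<code>eval()</code>.  As an aside, and only for Python versions that provide
the <code>ast</code> module, <code>ast.literal_eval()</code> performs the same
conversion for a string literal while refusing arbitrary expressions.
This substitution is a suggestion and is not part of
<span class="file">example.py</span>; <code>token_text</code> is a hypothetical stand-in
for <code>vars['docstring']</code>.

```python
import ast

# Hypothetical stand-in for vars['docstring']: the STRING token exactly as
# it appears in the source, quotes included.
token_text = '"""Some documentation.\n"""'

docstring = ast.literal_eval(token_text)
print(repr(docstring))

# Unlike eval(), literal_eval() rejects anything that is not a literal.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```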

<P>
After docstring extraction, a simple definition discovery
algorithm operates on the <tt class="constant">stmt</tt> nodes of the
<tt class="constant">suite</tt> node.  The special case of the short form is not
tested; since there are no <tt class="constant">stmt</tt> nodes in the short form,
the algorithm will silently skip the single <tt class="constant">simple_stmt</tt>
node and correctly not discover any nested definitions.

<P>
Each statement in the code block is categorized as
a class definition, function or method definition, or
something else.  For the definition statements, the name of the
element defined is extracted and a representation object
appropriate to the definition is created with the defining subtree
passed as an argument to the constructor.  The representation objects
are stored in instance variables and may be retrieved by name using
the appropriate accessor methods.
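
<P>
The categorize-and-recurse step described above can be sketched without
the parse tree tuples.  The block below deliberately swaps in the
<code>ast</code> module of later Python versions in place of the
<code>parser</code> module used by this example, so it illustrates the shape
of the algorithm, not the original implementation.  The function name
<code>collect_definitions()</code> and the dictionary layout are hypothetical.

```python
import ast


def collect_definitions(node):
    """Describe one code block: its docstring and its nested definitions."""
    info = {
        "doc": ast.get_docstring(node) or "",
        "functions": {},
        "classes": {},
    }
    # Only the statements directly inside the block are examined.
    for stmt in node.body:
        if isinstance(stmt, ast.FunctionDef):
            info["functions"][stmt.name] = collect_definitions(stmt)
        elif isinstance(stmt, ast.ClassDef):
            info["classes"][stmt.name] = collect_definitions(stmt)
        # Any other kind of statement is "something else" and is skipped.
    return info


SOURCE = '''
"""Module docs."""

class Spam:
    "A class."
    def eggs(self):
        "A method."

def make_power(exp):
    "Make a function."
    def raiser(x, y=exp):
        return x ** y
    return raiser
'''

module_info = collect_definitions(ast.parse(SOURCE))
print(sorted(module_info["classes"]), sorted(module_info["functions"]))
```

<P>
As in the tuple-based version, each nested definition is handed to the
same routine that processed its parent, so the result mirrors the
nesting of the source.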

<P>
The public classes provide any accessors required which are more
specific than those provided by the <tt class="class">SuiteInfoBase</tt> class, but
the real extraction algorithm remains common to all forms of code
blocks.  A high-level function can be used to extract the complete set
of information from a source file.  (See file <span class="file">example.py</span>.)

<P>
<dl><dd><pre class="verbatim">
def get_docs(fileName):
    import os
    import parser

    source = open(fileName).read()
    basename = os.path.basename(os.path.splitext(fileName)[0])
    ast = parser.suite(source)
    return ModuleInfo(ast.totuple(), basename)
</pre></dl>

<P>
This provides an easy-to-use interface to the documentation of a
module.  If information is required which is not extracted by the code
of this example, the code may be extended at clearly defined points to
provide additional capabilities.
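
<P>
For readers whose Python no longer ships the <code>parser</code> module, a
comparable entry point can be sketched with the <code>ast</code> module.
This is a stand-in under stated assumptions, not a port of
<code>get_docs()</code>: the name <code>get_docs_sketch()</code> and its tuple
return value are hypothetical, and only top-level names are reported.

```python
import ast
import os
import tempfile


def get_docs_sketch(file_name):
    """Return (module name, module docstring, names of top-level defs)."""
    with open(file_name, encoding="utf-8") as source_file:
        source = source_file.read()
    # Same module-name derivation as get_docs(): strip directory and extension.
    basename = os.path.basename(os.path.splitext(file_name)[0])
    tree = ast.parse(source, filename=file_name)
    names = [node.name for node in tree.body
             if isinstance(node, (ast.FunctionDef, ast.ClassDef))]
    return basename, ast.get_docstring(tree), names


# Usage: write a throwaway module to a temporary directory and inspect it.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_module.py")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('"""Demo docs."""\n\ndef f():\n    pass\n\nclass C:\n    pass\n')
    result = get_docs_sketch(path)
print(result)
```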

<ADDRESS>
<hr>See <i><a href="about.html" tppabs="http://www.python.org/doc/current/lib/about.html">About this document...</a></i> for information on suggesting changes.
</ADDRESS>
</BODY>
</HTML>