📄 states.py

# Author: David Goodger
# Contact: goodger@users.sourceforge.net
# Revision: $Revision: 4258 $
# Date: $Date: 2006-01-09 04:29:23 +0100 (Mon, 09 Jan 2006) $
# Copyright: This module has been placed in the public domain.

"""
This is the ``docutils.parsers.restructuredtext.states`` module, the core of
the reStructuredText parser.  It defines the following:

:Classes:
    - `RSTStateMachine`: reStructuredText parser's entry point.
    - `NestedStateMachine`: recursive StateMachine.
    - `RSTState`: reStructuredText State superclass.
    - `Inliner`: For parsing inline markup.
    - `Body`: Generic classifier of the first line of a block.
    - `SpecializedBody`: Superclass for compound element members.
    - `BulletList`: Second and subsequent bullet_list list_items
    - `DefinitionList`: Second+ definition_list_items.
    - `EnumeratedList`: Second+ enumerated_list list_items.
    - `FieldList`: Second+ fields.
    - `OptionList`: Second+ option_list_items.
    - `RFC2822List`: Second+ RFC2822-style fields.
    - `ExtensionOptions`: Parses directive option fields.
    - `Explicit`: Second+ explicit markup constructs.
    - `SubstitutionDef`: For embedded directives in substitution definitions.
    - `Text`: Classifier of second line of a text block.
    - `SpecializedText`: Superclass for continuation lines of Text-variants.
    - `Definition`: Second line of potential definition_list_item.
    - `Line`: Second line of overlined section title or transition marker.
    - `Struct`: An auxiliary collection class.

:Exception classes:
    - `MarkupError`
    - `ParserError`
    - `MarkupMismatch`

:Functions:
    - `escape2null()`: Return a string, escape-backslashes converted to nulls.
    - `unescape()`: Return a string, nulls removed or restored to backslashes.

:Attributes:
    - `state_classes`: set of State classes used with `RSTStateMachine`.

Parser Overview
===============

The reStructuredText parser is implemented as a recursive state machine,
examining its input one line at a time.  To understand how the parser works,
please first become familiar with the `docutils.statemachine` module.  In the
description below, references are made to classes defined in this module;
please see the individual classes for details.

Parsing proceeds as follows:

1. The state machine examines each line of input, checking each of the
   transition patterns of the state `Body`, in order, looking for a match.
   The implicit transitions (blank lines and indentation) are checked before
   any others.  The 'text' transition is a catch-all (matches anything).

2. The method associated with the matched transition pattern is called.

   A. Some transition methods are self-contained, appending elements to the
      document tree (`Body.doctest` parses a doctest block).  The parser's
      current line index is advanced to the end of the element, and parsing
      continues with step 1.

   B. Other transition methods trigger the creation of a nested state machine,
      whose job is to parse a compound construct ('indent' does a block quote,
      'bullet' does a bullet list, 'overline' does a section [first checking
      for a valid section header], etc.).

      - In the case of lists and explicit markup, a one-off state machine is
        created and run to parse contents of the first item.

      - A new state machine is created and its initial state is set to the
        appropriate specialized state (`BulletList` in the case of the
        'bullet' transition; see `SpecializedBody` for more detail).  This
        state machine is run to parse the compound element (or series of
        explicit markup elements), and returns as soon as a non-member element
        is encountered.  For example, the `BulletList` state machine ends as
        soon as it encounters an element which is not a list item of that
        bullet list.  The optional omission of inter-element blank lines is
        enabled by this nested state machine.

      - The current line index is advanced to the end of the elements parsed,
        and parsing continues with step 1.

   C. The result of the 'text' transition depends on the next line of text.
      The current state is changed to `Text`, under which the second line is
      examined.  If the second line is:

      - Indented: The element is a definition list item, and parsing proceeds
        similarly to step 2.B, using the `DefinitionList` state.

      - A line of uniform punctuation characters: The element is a section
        header; again, parsing proceeds as in step 2.B, and `Body` is still
        used.

      - Anything else: The element is a paragraph, which is examined for
        inline markup and appended to the parent element.  Processing
        continues with step 1.
"""

__docformat__ = 'reStructuredText'


import sys
import re
import roman
from types import TupleType
from docutils import nodes, statemachine, utils, urischemes
from docutils import ApplicationError, DataError
from docutils.statemachine import StateMachineWS, StateWS
from docutils.nodes import fully_normalize_name as normalize_name
from docutils.nodes import whitespace_normalize_name
from docutils.utils import escape2null, unescape, column_width
from docutils.parsers.rst import directives, languages, tableparser, roles
from docutils.parsers.rst.languages import en as _fallback_language_module


class MarkupError(DataError): pass
class UnknownInterpretedRoleError(DataError): pass
class InterpretedRoleNotImplementedError(DataError): pass
class ParserError(ApplicationError): pass
class MarkupMismatch(Exception): pass


class Struct:

    """Stores data attributes for dotted-attribute access."""

    def __init__(self, **keywordargs):
        self.__dict__.update(keywordargs)


class RSTStateMachine(StateMachineWS):

    """
    reStructuredText's master StateMachine.

    The entry point to reStructuredText parsing is the `run()` method.
    """

    def run(self, input_lines, document, input_offset=0, match_titles=1,
            inliner=None):
        """
        Parse `input_lines` and modify the `document` node in place.

        Extend `StateMachineWS.run()`: set up parse-global data and
        run the StateMachine.
        """
        self.language = languages.get_language(
            document.settings.language_code)
        self.match_titles = match_titles
        if inliner is None:
            inliner = Inliner()
        inliner.init_customizations(document.settings)
        self.memo = Struct(document=document,
                           reporter=document.reporter,
                           language=self.language,
                           title_styles=[],
                           section_level=0,
                           section_bubble_up_kludge=0,
                           inliner=inliner)
        self.document = document
        self.attach_observer(document.note_source)
        self.reporter = self.memo.reporter
        self.node = document
        results = StateMachineWS.run(self, input_lines, input_offset,
                                     input_source=document['source'])
        assert results == [], 'RSTStateMachine.run() results should be empty!'
        self.node = self.memo = None    # remove unneeded references


class NestedStateMachine(StateMachineWS):

    """
    StateMachine run from within other StateMachine runs, to parse nested
    document structures.
    """

    def run(self, input_lines, input_offset, memo, node, match_titles=1):
        """
        Parse `input_lines` and populate a `docutils.nodes.document` instance.

        Extend `StateMachineWS.run()`: set up document-wide data.
        """
        self.match_titles = match_titles
        self.memo = memo
        self.document = memo.document
        self.attach_observer(self.document.note_source)
        self.reporter = memo.reporter
        self.language = memo.language
        self.node = node
        results = StateMachineWS.run(self, input_lines, input_offset)
        assert results == [], ('NestedStateMachine.run() results should be '
                               'empty!')
        return results


class RSTState(StateWS):

    """
    reStructuredText State superclass.

    Contains methods used by all State subclasses.
    """

    nested_sm = NestedStateMachine

    def __init__(self, state_machine, debug=0):
        self.nested_sm_kwargs = {'state_classes': state_classes,
                                 'initial_state': 'Body'}
        StateWS.__init__(self, state_machine, debug)

    def runtime_init(self):
        StateWS.runtime_init(self)
        memo = self.state_machine.memo
        self.memo = memo
        self.reporter = memo.reporter
        self.inliner = memo.inliner
        self.document = memo.document
        self.parent = self.state_machine.node

    def goto_line(self, abs_line_offset):
        """
        Jump to input line `abs_line_offset`, ignoring jumps past the end.
        """
        try:
            self.state_machine.goto_line(abs_line_offset)
        except EOFError:
            pass

    def no_match(self, context, transitions):
        """
        Override `StateWS.no_match` to generate a system message.

        This code should never be run.
        """
        self.reporter.severe(
            'Internal error: no transition pattern match.  State: "%s"; '
            'transitions: %s; context: %s; current line: %r.'
            % (self.__class__.__name__, transitions, context,
               self.state_machine.line),
            line=self.state_machine.abs_line_number())
        return context, None, []

    def bof(self, context):
        """Called at beginning of file."""
        return [], []

    def nested_parse(self, block, input_offset, node, match_titles=0,
                     state_machine_class=None, state_machine_kwargs=None):
        """
        Create a new StateMachine rooted at `node` and run it over the input
        `block`.
        """
        if state_machine_class is None:
            state_machine_class = self.nested_sm
        if state_machine_kwargs is None:
            state_machine_kwargs = self.nested_sm_kwargs
        block_length = len(block)
        state_machine = state_machine_class(debug=self.debug,
                                            **state_machine_kwargs)
        state_machine.run(block, input_offset, memo=self.memo,
                          node=node, match_titles=match_titles)
        state_machine.unlink()
        new_offset = state_machine.abs_line_offset()
        # No `block.parent` implies disconnected -- lines aren't in sync:
        if block.parent and (len(block) - block_length) != 0:
            # Adjustment for block if modified in nested parse:
            self.state_machine.next_line(len(block) - block_length)
        return new_offset

    def nested_list_parse(self, block, input_offset, node, initial_state,
                          blank_finish,
                          blank_finish_state=None,
                          extra_settings={},
                          match_titles=0,
                          state_machine_class=None,
                          state_machine_kwargs=None):
        """
        Create a new StateMachine rooted at `node` and run it over the input
        `block`. Also keep track of optional intermediate blank lines and the
        required final one.
        """
        if state_machine_class is None:
            state_machine_class = self.nested_sm
        if state_machine_kwargs is None:
            state_machine_kwargs = self.nested_sm_kwargs.copy()
        state_machine_kwargs['initial_state'] = initial_state
        state_machine = state_machine_class(debug=self.debug,
                                            **state_machine_kwargs)
        if blank_finish_state is None:
            blank_finish_state = initial_state
        state_machine.states[blank_finish_state].blank_finish = blank_finish
        for key, value in extra_settings.items():
            setattr(state_machine.states[initial_state], key, value)
        state_machine.run(block, input_offset, memo=self.memo,
                          node=node, match_titles=match_titles)
        blank_finish = state_machine.states[blank_finish_state].blank_finish
        state_machine.unlink()
        return state_machine.abs_line_offset(), blank_finish

    def section(self, title, source, style, lineno, messages):
        """Check for a valid subsection and create one if it checks out."""
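The "Parser Overview" in the module docstring describes two ideas that are easy to lose in the prose: transition patterns are tried in a fixed order with 'text' as the catch-all (step 1), and a 'text' line is only classified after looking at the line that follows it (step 2.C). The sketch below is a toy illustration of those two ideas only. It is written for Python 3, does not use docutils, and every name in it (`TRANSITIONS`, `first_transition`, `classify_text`, `classify_first_line`, `is_uniform_punctuation`) is hypothetical. The patterns are deliberately reduced and are not the ones the real `Body` state uses.

```python
import re
import string

# Hypothetical, heavily reduced transition table. Order matters: the
# implicit transitions (blank, indent) come first and 'text' comes last.
TRANSITIONS = [
    ("blank", re.compile(r"\s*$")),
    ("indent", re.compile(r" +")),
    ("bullet", re.compile(r"[-+*]( +|$)")),
    ("text", re.compile(r"")),  # catch-all: the empty pattern matches anything
]


def first_transition(line):
    """Return the name of the first transition whose pattern matches."""
    for name, pattern in TRANSITIONS:
        if pattern.match(line):
            return name
    raise AssertionError("unreachable: 'text' matches anything")


def is_uniform_punctuation(line):
    """True if the line is one punctuation character repeated (e.g. '====')."""
    stripped = line.rstrip()
    return (bool(stripped)
            and stripped[0] in string.punctuation
            and stripped == stripped[0] * len(stripped))


def classify_text(lines):
    """Mimic step 2.C: classify a 'text' line by inspecting the second line."""
    second = lines[1] if len(lines) > 1 else ""
    if second.strip() and second.startswith(" "):
        return "definition_list_item"
    if is_uniform_punctuation(second):
        return "section_title"
    return "paragraph"


def classify_first_line(lines):
    """Classify the block that starts at lines[0]."""
    name = first_transition(lines[0])
    if name == "text":
        return classify_text(lines)
    return name


if __name__ == "__main__":
    print(classify_first_line(["Title", "=====", ""]))
    print(classify_first_line(["term", "    definition"]))
    print(classify_first_line(["- item one", "- item two"]))
```

The real parser does considerably more at each step (for instance, it validates section titles and hands compound constructs to nested state machines), so this should be read as a picture of the dispatch order, not as a description of the docutils implementation.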