            element. [SGML]

    entity
            data with an associated notation or interpretation; for
            example, a sequence of octets associated with an
            Internet Media Type. [SGML]

    fragment identifier
            the portion of an HREF attribute value following the `#'
            character which modifies the presentation of the
            destination of a hyperlink.

    form data set
            a sequence of name/value pairs; the names are given by
            an HTML document and the values are given by a user.

    HTML document
            An SGML document conforming to this document type
            definition.

    hyperlink
            a relationship between two anchors, called the head and
            the tail. The link goes from the tail to the head. The
            head and tail are also known as destination and source,
            respectively.



Berners-Lee & Connolly      Standards Track                     [Page 7]

RFC 1866            Hypertext Markup Language - 2.0        November 1995


    markup
            Syntactically delimited characters added to the data of
            a document to represent its structure. There are four
            different kinds of markup: descriptive markup (tags),
            references, markup declarations, and processing
            instructions. [SGML]

    may
            A document or user interface is conforming whether this
            statement applies or not.

    media type
            an Internet Media Type, as per [IMEDIA].

    message entity
            a head and body. The head is a collection of name/value
            fields, and the body is a sequence of octets. The head
            defines the content type and content transfer encoding
            of the body. [MIME]

    minimally conforming
    HTML user agent
            A user agent that conforms to this specification except
            for form processing. It may only process level 1 HTML
            documents.

    must
            Documents or user agents in conflict with this statement
            are not conforming.

    numeric character
    reference
            markup that refers to a character by its code position
            in the document character set.

    SGML document
            A sequence of characters organized physically as a set
            of entities and logically into a hierarchy of elements.
            An SGML document consists of data characters and markup;
            the markup describes the structure of the information
            and an instance of that structure. [SGML]

    shall
            If a document or user agent conflicts with this
            statement, it does not conform to this specification.

    should
            If a document or user agent conflicts with this
            statement, undesirable results may occur in practice
            even though it conforms to this specification.

    start-tag
            Descriptive markup that identifies the start of an
            element and specifies its generic identifier and
            attributes. [SGML]

    syntax-reference
    character set
            A coded character set whose range includes all
            characters used for markup; e.g. name characters and
            delimiter characters.

    tag
            Markup that delimits an element. A tag includes a name
            which refers to an element declaration in the DTD, and
            may include attributes. [SGML]

    text entity
            A finite sequence of characters. A text entity typically
            takes the form of a sequence of octets with some
            associated character encoding scheme, transmitted over
            the network or stored in a file. [SGML]

    typical
            Typical processing is described for many elements. This
            is not a mandatory part of the specification but is
            given as guidance for designers and to help explain the
            uses for which the elements were intended.

    URI
            A Uniform Resource Identifier is a formatted string that
            serves as an identifier for a resource, typically on the
            Internet. URIs are used in HTML to identify the anchors
            of hyperlinks. URIs in common practice include Uniform
            Resource Locators (URLs)[URL] and Relative URLs
            [RELURL].

    user agent
            A component of a distributed system that presents an
            interface and processes requests on behalf of a user;
            for example, a www browser or a mail user agent.

    WWW
            The World-Wide Web is a hypertext-based, distributed
            information system created by researchers at CERN in
            Switzerland. <URL:http://www.w3.org/>

3. HTML as an Application of SGML

   HTML is an application of ISO 8879:1986 -- Standard Generalized
   Markup Language (SGML). SGML is a system for defining structured
   document types and markup languages to represent instances of those
   document types[SGML]. The public text -- DTD and SGML declaration --
   of the HTML document type definition is provided in 9, "HTML Public
   Text".

   The term "HTML" refers to both the document type defined here and the
   markup language for representing instances of this document type.

3.1. SGML Documents

   An HTML document is an SGML document; that is, a sequence of
   characters organized physically into a set of entities, and logically
   as a hierarchy of elements.

   In the SGML specification, the first production of the SGML syntax
   grammar separates an SGML document into three parts: an SGML
   declaration, a prologue, and an instance. For the purposes of this
   specification, the prologue is a DTD. This DTD describes another
   grammar: the start symbol is given in the doctype declaration, the
   terminals are data characters and tags, and the productions are
   determined by the element declarations. The instance must conform to
   the DTD, that is, it must be in the language defined by this grammar.

   The SGML declaration determines the lexicon of the grammar. It
   specifies the document character set, which determines a character
   repertoire that contains all characters that occur in all text
   entities in the document, and the code positions associated with
   those characters.

   The SGML declaration also specifies the syntax-reference character
   set of the document, and a few other parameters that bind the
   abstract syntax of SGML to a concrete syntax. This concrete syntax
   determines how the sequence of characters of the document is mapped
   to a sequence of terminals in the grammar of the prologue.

   For example, consider the following document:

    <!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <title>Parsing Example</title>
    <p>Some text. <em>&#42;wow&#42;</em></p>

   An HTML user agent should use the SGML declaration that is given in
   9.5, "SGML Declaration for HTML". According to its document character
   set, `&#42;' refers to an asterisk character, `*'.

   The instance above is regarded as the following sequence of
   terminals:

        1. start-tag: TITLE

        2. data characters: "Parsing Example"

        3. end-tag: TITLE

        4. start-tag: P

        5. data characters: "Some text. "

        6. start-tag: EM

        7. data characters: "*wow*"

        8. end-tag: EM

        9. end-tag: P

   The start symbol of the DTD grammar is HTML, and the productions are
   given in the public text identified by `-//IETF//DTD HTML 2.0//EN'
   (9.1, "HTML DTD"). The terminals above parse as:

       HTML
        |
        \-HEAD
        |  |
        |  \-TITLE
        |      |
        |      \-<TITLE>
        |      |
        |      \-"Parsing Example"
        |      |
        |      \-</TITLE>
        |
        \-BODY
          |
          \-P
            |
            \-<P>
            |
            \-"Some text. "
            |
            \-EM
            |  |
            |  \-<EM>
            |  |
            |  \-"*wow*"
            |  |
            |  \-</EM>
            |
            \-</P>

   Some of the elements are delimited explicitly by tags, while the
   boundaries of others are inferred. The <HTML> element contains a
   <HEAD> element and a <BODY> element. The <HEAD> contains <TITLE>,
   which is explicitly delimited by start- and end-tags.
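
   As a rough illustration of the distinction between explicit tags and
   inferred element boundaries, the sketch below tokenizes the example
   document with Python's standard html.parser module. That module is a
   tolerant HTML tokenizer, not an SGML parser, so this only
   approximates the behaviour described here. The names
   TerminalCollector and terminals_of are hypothetical, and skipping
   whitespace-only data is a simplifying assumption.

```python
from html.parser import HTMLParser

# The example document from section 3.1.
DOCUMENT = (
    '<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">\n'
    '<title>Parsing Example</title>\n'
    '<p>Some text. <em>&#42;wow&#42;</em></p>'
)


class TerminalCollector(HTMLParser):
    """Collects tags and data in document order (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns &#42; into "*" before handle_data.
        super().__init__(convert_charrefs=True)
        self.terminals = []

    def handle_starttag(self, tag, attrs):
        self.terminals.append(("start-tag", tag.upper()))

    def handle_endtag(self, tag):
        self.terminals.append(("end-tag", tag.upper()))

    def handle_data(self, data):
        # Assumption: line breaks between tags are not reported as data.
        if data.strip():
            self.terminals.append(("data characters", data))


def terminals_of(text):
    collector = TerminalCollector()
    collector.feed(text)
    collector.close()
    return collector.terminals


if __name__ == "__main__":
    for number, (kind, value) in enumerate(terminals_of(DOCUMENT), 1):
        print(number, kind, repr(value))
```

   No HTML, HEAD, or BODY tag appears in the collected sequence. Those
   elements exist only in the parse tree, where their boundaries are
   inferred from the DTD.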

3.2. HTML Lexical Syntax

   SGML specifies an abstract syntax and a reference concrete syntax.
   Aside from certain quantities and capacities (e.g. the limit on the
   length of a name), all HTML documents use the reference concrete
   syntax. In particular, all markup characters are in the repertoire of
   [ISO-646]. Data characters are drawn from the document character set
   (see 6, "Characters, Words, and Paragraphs").

   A complete discussion of SGML parsing, e.g. the mapping of a sequence
   of characters to a sequence of tags and data, is left to the SGML
   standard[SGML]. This section is only a summary.

3.2.1. Data Characters

   Any sequence of characters that does not constitute markup (see 9.6
   "Delimiter Recognition" of [SGML]) is mapped directly to a string of
   data characters. Some markup also maps to data character strings.
   Numeric character references map to single-character strings, via the
   document character set. Each reference to one of the general entities
   defined in the HTML DTD maps to a single-character string.

   For example,

    abc&lt;def    => "abc","<","def"
    abc&#60;def   => "abc","<","def"

   The terminating semicolon on entity or numeric character references
   is only necessary when the character following the reference would
   otherwise be recognized as part of the name (see 9.4.5 "Reference
   End" in [SGML]).

    abc &lt def     => "abc ","<"," def"
    abc &#60 def    => "abc ","<"," def"

   An ampersand is only recognized as markup when it is followed by a
   letter or a `#' and a digit:

    abc & lt def    => "abc & lt def"
    abc &# 60 def    => "abc &# 60 def"
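
   The recognition rules above can be approximated with a single
   regular expression. The sketch below is an assumption-laden
   illustration: ENTITIES is a small illustrative subset of the general
   entities, expand_references is a hypothetical name, only `;' is
   treated as a reference end, and code positions are assumed to
   coincide with Unicode code points below 256.

```python
import re

# Illustrative subset only; the HTML DTD declares more entities.
ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}

# "&" counts as markup only when followed by a letter, or by "#" and a
# digit. The name or number is matched greedily, so a trailing ";" is
# needed only when the next character could continue the name.
REFERENCE = re.compile(r"&(?:#([0-9]+)|([A-Za-z][A-Za-z0-9.-]*));?")


def expand_references(text):
    """Replace recognized references; leave everything else untouched."""

    def replace(match):
        number, name = match.group(1), match.group(2)
        if number is not None:
            code = int(number)
            # Assumption: only code positions 0-255 are expanded.
            return chr(code) if code < 256 else match.group(0)
        # Unknown entity names are left exactly as written.
        return ENTITIES.get(name, match.group(0))

    return REFERENCE.sub(replace, text)


if __name__ == "__main__":
    for sample in ("abc &lt def", "abc &#60 def",
                   "abc & lt def", "abc &# 60 def"):
        print(repr(sample), "=>", repr(expand_references(sample)))
```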

   A useful technique for translating plain text to HTML is to replace
   each '<', '&', and '>' by an entity reference or numeric character
   reference as follows:

                     ENTITY      NUMERIC
           CHARACTER REFERENCE   CHAR REF     CHARACTER DESCRIPTION
           --------- ----------  -----------  ---------------------
             &       &amp;       &#38;        Ampersand
             <       &lt;        &#60;        Less than
             >       &gt;        &#62;        Greater than
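
   A minimal sketch of this translation in Python, using the entity
   references from the table. The name plain_text_to_html is
   hypothetical.

```python
# Entity references taken from the table above.
REPLACEMENTS = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def plain_text_to_html(text):
    """Replace each '&', '<' and '>' with its entity reference."""
    # Mapping one character at a time avoids escaping the "&" of a
    # reference that an earlier replacement just produced.
    return "".join(REPLACEMENTS.get(char, char) for char in text)


if __name__ == "__main__":
    print(plain_text_to_html("if a < b && b > c"))
```

   The standard library call html.escape(text, quote=False) performs
   the same three replacements. If chained str.replace calls are used
   instead, the ampersand must be replaced first.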

        NOTE - There are SGML mechanisms, CDATA and RCDATA
        declared content, that allow most `<', `>', and `&'
        characters to be entered without the use of entity
        references. Because these mechanisms tend to be used and
        implemented inconsistently, and because they conflict


