    &gt;        >
    &amp;       &
    &quot;      "
    &apos;      '

By defining new internal entities, you can create new substitutions of your own.
For example:

    <!ENTITY ntilde "&#xf1;">
    ...
    Jalape&ntilde;o pepper
    ...
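
To see the substitution happen, here is a minimal Python sketch using the
standard xml.etree.ElementTree module.  The <item> element and the
expand_entities() helper are made up for illustration; only the ntilde
entity comes from the example above.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares the entity.  The parser then expands
# &ntilde; wherever it appears in the document body.
DOC = """<!DOCTYPE item [
    <!ENTITY ntilde "&#xf1;">
]>
<item>Jalape&ntilde;o pepper</item>"""

def expand_entities(xml_text):
    """Parse xml_text and return the text of the root element."""
    return ET.fromstring(xml_text).text
```

Calling expand_entities(DOC) returns the string with the entity already
replaced by the character U+00F1, so the application never sees &ntilde;.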

When an entity refers to another file or location, it is known as an external general entity.
For example, if you composed a book in several parts, you might do this:

<!DOCTYPE Book SYSTEM "book.dtd" [
   <!ENTITY chap1 SYSTEM "chap1.xml">
   <!ENTITY chap2 SYSTEM "chap2.xml">
   <!ENTITY chap3 SYSTEM "http://www.dead.com/book/draft/chap3.xml">
   ...
]>

<Book>
  ...

  &chap1;
  &chap2;
  &chap3;
  ...

</Book>

In this case, each entity is replaced by the entire contents of the
location specified in the DTD.  For example, &chap1; is replaced by
all of the text in the file chap1.xml.  When this substitution occurs,
it is exactly the same as if you inserted chap1.xml into the file at
that location.  
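
Many parsers do not fetch external entities unless asked to.  As a rough
sketch, the low-level xml.parsers.expat module in the Python standard
library lets you supply the file contents yourself.  The read_book_text()
function below is hypothetical and makes two simplifying assumptions: system
identifiers are relative file names, and only character data is collected.

```python
import os
from xml.parsers import expat

def read_book_text(main_path):
    """Return the character data of main_path, following external entities.

    Assumption: every system identifier is a file name relative to the
    directory that contains main_path.  Remote URIs are not handled.
    """
    base_dir = os.path.dirname(os.path.abspath(main_path))
    chunks = []

    parser = expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def on_external_entity(context, base, system_id, public_id):
        # Create a sub-parser for the entity and feed it the file contents.
        sub = parser.ExternalEntityParserCreate(context)
        sub.CharacterDataHandler = chunks.append
        with open(os.path.join(base_dir, system_id), "rb") as f:
            sub.ParseFile(f)
        return 1  # a nonzero value tells expat the entity was handled

    parser.ExternalEntityRefHandler = on_external_entity
    with open(main_path, "rb") as f:
        parser.ParseFile(f)
    return "".join(chunks)
```

Given a book.xml that declares chap1 as above and a chap1.xml next to it,
read_book_text("book.xml") returns the text of both files as one string,
just as if chap1.xml had been pasted in at the point of the reference.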

When specifying an external entity, the SYSTEM keyword is used to
specify its location as a URI, which may name a file on the local
machine or a remote resource.  For example:

   <!ENTITY chap1 SYSTEM "chap1.xml">

The name that is used is formally known as a system identifier.

In certain cases, an entity may use a formal public identifier (FPI).
This is most commonly used when referring to entities that are highly standardized
or in wide public use.  For example, if you were to create an XHTML document,
you would start with a declaration like this:

   <!DOCTYPE html
        PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

In this case, the PUBLIC keyword provides a formal name for the XHTML
DTD and a URI where the DTD can be obtained.  The reason for using a
public identifier is that XML processing tools might decide to create
local copies of commonly used public entities.  For instance, if a
copy of the XHTML DTD was already located on the local machine, an XML
processor could use the public identifier to perform a lookup and use
the local copy instead of actually fetching data from the URI given in
the <!DOCTYPE ... > declaration.
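
The lookup itself can be pictured as a simple table keyed by public
identifier.  The sketch below is only an illustration: LOCAL_CATALOG, the
file path in it, and resolve_dtd() are all made-up names, and real tools
use their own catalog formats.

```python
# Hypothetical catalog: public identifier -> path of a local copy.
# The path is an invented example, not a standard location.
LOCAL_CATALOG = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "/usr/share/xml/dtd/xhtml1-strict.dtd",
}

def resolve_dtd(public_id, system_id, catalog=LOCAL_CATALOG):
    """Prefer a local copy keyed by public_id, else fall back to system_id."""
    return catalog.get(public_id, system_id)
```

If the public identifier is in the catalog, the local path is returned.
Otherwise the function hands back the system identifier unchanged, and the
caller would have to fetch the DTD from that URI.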

In some cases, an external entity refers to data that is not in XML
format, such as images or typesetting data.  This type of
entity is known as an external unparsed entity.  To create an
unparsed entity, you first need to include a <!NOTATION> declaration like this:

  <!DOCTYPE ... [
      <!NOTATION GIF SYSTEM "image/gif">
      ...
  ]>

The NOTATION declaration gives a name to a specific type of data.  In this case,
we are declaring the name "GIF" to refer to the MIME type "image/gif".   Next,
to declare an entity in this format, you use a declaration like this:

   <!ENTITY photo SYSTEM "photo.gif" NDATA GIF>

This declares the entity "photo" to be unparsed data of type GIF.  

Once declared, you cannot simply include the entity in the usual
manner.  For example, it is illegal to do this:

    &photo;      <!-- Illegal: photo is unparsed data  -->

Instead, the entity must be associated with an element attribute like this:

   <!DOCTYPE ... [
       <!ELEMENT image EMPTY>
       <!ATTLIST image name ENTITY #REQUIRED>
       ...
       <!NOTATION GIF SYSTEM "image/gif">
       ...
       <!ENTITY photo SYSTEM "photo.gif" NDATA GIF>
   ]>

Then the entity is included in a document as follows:

   <image name="photo"/>
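
The parser never reads photo.gif itself.  It only reports the declarations,
and the application has to connect the attribute value to the entity.  Here
is a minimal sketch using xml.parsers.expat from the Python standard
library.  The find_images() function is hypothetical, and the complete
document in DOC is assembled from the fragments shown above.

```python
from xml.parsers import expat

DOC = """<!DOCTYPE image [
    <!ELEMENT image EMPTY>
    <!ATTLIST image name ENTITY #REQUIRED>
    <!NOTATION GIF SYSTEM "image/gif">
    <!ENTITY photo SYSTEM "photo.gif" NDATA GIF>
]>
<image name="photo"/>"""

def find_images(xml_text):
    """Return a (location, data type) pair for each <image> element."""
    notations = {}   # notation name -> system identifier
    entities = {}    # entity name -> (system identifier, notation name)
    images = []

    def on_notation(name, base, system_id, public_id):
        notations[name] = system_id

    def on_unparsed_entity(name, base, system_id, public_id, notation_name):
        entities[name] = (system_id, notation_name)

    def on_start(tag, attrs):
        if tag == "image":
            # The attribute holds the entity name, not the file name.
            location, notation = entities[attrs["name"]]
            images.append((location, notations[notation]))

    parser = expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return images
```

For the document above, find_images(DOC) returns
[("photo.gif", "image/gif")].  What to do with that file is entirely up to
the application.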

In addition to general entities, XML allows entities to be defined for
use in DTD definitions.  These are known as parameter entities and are
not described here since they are of little practical importance to
the XML processing topics covered in the next two chapters.  Readers should
consult an XML book for details on this.

7. XML Namespaces
-----------------
An issue that sometimes arises when working with XML documents is how
to deal with conflicts between element names.  For example, two
entirely different XML documents may use similar element names
although those elements in each document mean different things.  In
isolation, this isn't a problem.  However, there are many situations
where it is desirable to embed XML content from one document
definition into another document.   For example, if you wanted to
allow HTML code to be embedded in our recipe code, you might consider
having a document like this:

<?xml version="1.0" encoding="utf-8"?>
<recipe>
   <title>Famous Guacamole</title>
   <description>
   A <em>southwest</em> favorite!
   </description>
   ...
</recipe>

However, this now raises a number of questions.  First, is the <em>
element used in the description part of the recipe document or is
it HTML markup?  Also, how is <title> supposed to be handled?  Is that
HTML or is it part of a recipe?  Or is it both?  

To deal with this, XML provides support for namespaces.  A namespace
simply allows a set of document tags to be referenced by prepending
them with a specific prefix in order to distinguish them from other
tags.   To use a namespace, you simply include a namespace 
declaration using xmlns like this:

   <recipe xmlns:html="http://www.w3.org/1999/xhtml">

Now, in the document, if you wanted to use HTML tags, you would write

   <description>
   A <html:em>southwest</html:em> favorite!
   </description>

The general form of a namespace declaration is:

   < ... xmlns:prefix="identifier" ...>

The prefix field is the name that you will use to refer to the
namespace in your document.  The identifier is simply a unique
identifier that is the actual name of the namespace. Typically this is
a URI (Uniform Resource Identifier), often one that looks like a web
address.  However, the URI is only used as a name: nothing has to
exist at that location, and parsers do not fetch it.

The choice of a prefix is arbitrary.  For example, you could also
write:

   <recipe xmlns:HTML="http://www.w3.org/1999/xhtml">
   <description>
   A <HTML:em>southwest</HTML:em> favorite!
   </description>

or

   <recipe xmlns:h="http://www.w3.org/1999/xhtml">
   <description>
   A <h:em>southwest</h:em> favorite!
   </description>
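
One way to convince yourself that the prefix does not matter is to look at
what a namespace-aware parser reports.  The sketch below uses Python's
xml.etree.ElementTree, which names each element as "{identifier}tag".  The
em_tag() helper is made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

def em_tag(prefix):
    """Build the recipe fragment with the given prefix and return the
    name ElementTree reports for the emphasized element."""
    doc = (
        f'<recipe xmlns:{prefix}="{XHTML}"><description>'
        f'A <{prefix}:em>southwest</{prefix}:em> favorite!'
        f'</description></recipe>'
    )
    root = ET.fromstring(doc)
    return root.find("description")[0].tag
```

em_tag("html"), em_tag("HTML"), and em_tag("h") all return
"{http://www.w3.org/1999/xhtml}em".  The prefix has disappeared and only
the identifier remains.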

It is also possible to include more than one namespace in a document. 
For example:

    <document xmlns:foo="http://www.dead.com/foo.dtd"
              xmlns:bar="http://www.alive.com/bar.dtd">

       <foo:section> 
           ...
       </foo:section>
       <bar:section>
       </bar:section>

Namespace declarations can also be attached to any document
element--not just the root node.  For example, in the recipe example, if you only
wanted HTML to be used in the description part, you might do this:

    <recipe>
    <description xmlns:html="http://www.w3.org/1999/xhtml">
    A <html:em>southwest</html:em> favorite!
    </description>

Documents can also define a default namespace.  This is specified by omitting the
namespace prefix from the xmlns declaration.  For example:

  <recipe xmlns="http://www.dead.com/recipe.dtd">
  <description>
  ...
  </description>
  ...
  </recipe>

When elements are put in the default namespace, the prefix can be omitted.  For example, no prefix
is attached to recipe elements.   If you wanted to flip the example around and make HTML the
default namespace, you might write this:

  <recipe:recipe xmlns:recipe="http://www.dead.com/recipe.dtd"
                 xmlns="http://www.w3.org/1999/xhtml">
  <recipe:description>
  A <em>southwest</em> favorite!
  </recipe:description>
  ...
  </recipe:recipe>
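
As a quick check of how a parser sees this, the following sketch lists the
element names that Python's xml.etree.ElementTree reports for the flipped
example.  The tag_names() helper is hypothetical, and DOC is a shortened
version of the document above.

```python
import xml.etree.ElementTree as ET

RECIPE_NS = "http://www.dead.com/recipe.dtd"
XHTML_NS = "http://www.w3.org/1999/xhtml"

DOC = f"""<recipe:recipe xmlns:recipe="{RECIPE_NS}" xmlns="{XHTML_NS}">
  <recipe:description>A <em>southwest</em> favorite!</recipe:description>
</recipe:recipe>"""

def tag_names(xml_text):
    """Return the namespace-qualified name of every element, in order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]
```

The unprefixed <em> comes back qualified with the XHTML identifier, while
the two prefixed elements carry the recipe identifier.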

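Finally, when working with namespaces it is important to note that a
namespace prefix can also be prepended to attribute names.  A default
namespace does not apply to attributes, so an attribute belongs to a
namespace only if it carries a prefix.  Therefore, if you put a recipe
into a recipe namespace as shown above and wanted its attributes in
that namespace too, they would be specified like this: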

     <recipe:item recipe:num="1"> Jalapeno pepper, diced </recipe:item>
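
Here is a small sketch of how prefixed and unprefixed attributes look to
Python's xml.etree.ElementTree.  The "unit" attribute and the
attribute_names() helper are invented for the example.  The element also
declares a default namespace to show that it affects the element name but
not the unprefixed attribute.

```python
import xml.etree.ElementTree as ET

RECIPE_NS = "http://www.dead.com/recipe.dtd"

# "unit" is a made-up attribute with no prefix.
DOC = (
    f'<item xmlns="{RECIPE_NS}" xmlns:recipe="{RECIPE_NS}" '
    f'recipe:num="1" unit="each"> Jalapeno pepper, diced </item>'
)

def attribute_names(xml_text):
    """Return the element name and its attributes as the parser sees them."""
    element = ET.fromstring(xml_text)
    return element.tag, dict(element.attrib)
```

The element and the recipe:num attribute are both reported with the recipe
identifier, but "unit" is reported as a plain name with no namespace at
all.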

That's about it for XML namespaces.  For the most part it's nothing more than
prepending element and attribute names with a prefix.

8. Validating versus non-validating XML parsing
-----------------------------------------------

When processing XML documents, one often hears about "validating"
parsers and "non-validating" parsers.  This distinction primarily
pertains to the amount of error checking performed by an XML parser.
Non-validating parsers only require that an XML document be well-formed.
An XML document is well-formed if it conforms to the following rules:

    -  There is only one root document element.
    -  All document elements have starting and ending tags.
    -  Document elements are nested properly.
    -  There are no unresolved parsed entities (e.g., all expansions
       of the form &name; can be resolved).
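
The rules above are what a non-validating parser enforces.  As a minimal
sketch, the hypothetical is_well_formed() function below uses Python's
xml.etree.ElementTree, which is non-validating, and simply reports whether
parsing succeeds.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses without a well-formedness error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

For example, is_well_formed("<a><b/></a>") is True, while a second root
element ("<a/><b/>"), improperly nested tags ("<a><b></a></b>"), or an
undeclared entity ("<a>&nope;</a>") all make it return False.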

A validating parser requires that a document not only be well-formed,
but also that it conform to a DTD.  This means that all elements,
attributes, and entities must be formally defined in the DTD.  It also
means that document elements must be properly structured according
to the DTD specification.

Although validating parsers can be useful for debugging and
verification, they are hard to write, slow, and generally not needed
in most XML applications.  Therefore, a lot of processing applications
really only require the use of a non-validating parser.

9. References
-------------

S. Holzner, "Inside XML", New Riders Publishing.

W3C, need URL.