Network Working Group                                           J. Kunze
Request for Comments: 2731                                   Dublin Core
Category: Informational                              Metadata Initiative
                                                           December 1999
Encoding Dublin Core Metadata in HTML
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved.
1. Abstract
The Dublin Core [DC1] is a small set of metadata elements for
describing information resources. This document explains how these
elements are expressed using the META and LINK tags of HTML
[HTML4.0]. A sequence of metadata elements embedded in an HTML file
is taken to be a description of that file. Examples illustrate
conventions allowing interoperation with current software that
indexes, displays, and manipulates metadata, such as [SWISH-E],
[freeWAIS-sf2.0], [GLIMPSE], [HARVEST], [ISEARCH], etc., and the Perl
[PERL] scripts in the appendix.
2. HTML, Dublin Core, and Non-Dublin Core Metadata
The Dublin Core (DC) metadata initiative [DCHOME] has produced a
small set of resource description categories [DC1], or elements of
metadata (literally, data about data). Metadata elements are
typically small relative to the resource they describe and may, if
the resource format permits, be embedded in it. Two such formats are
the Hypertext Markup Language (HTML) and the Extensible Markup
Language (XML); HTML is currently in wide use, but once standardized,
XML [XML] in conjunction with the Resource Description Framework
[RDF] promises a significantly more expressive means of encoding
metadata. The [RDF] specification actually describes a way to use
RDF within an HTML document by adhering to an abbreviated syntax.
Kunze Informational [Page 1]
RFC 2731 Encoding Dublin Core Metadata in HTML December 1999
This document explains how to encode metadata using HTML 4.0
[HTML4.0]. It is not concerned with element semantics, which are
defined elsewhere. For illustrative purposes, some element semantics
are alluded to, but in no way should semantics appearing here be
considered definitive.
The HTML encoding allows elements of DC metadata to be interspersed
with non-DC elements (provided such mixing is consistent with rules
governing use of those non-DC elements). A DC element is indicated
by the prefix "DC", and a non-DC element by another prefix; for
example, the prefix "AC" is used with elements from the A-Core [AC].
3. The META Tag
The META tag of HTML is designed to encode a named metadata element.
Each element describes a given aspect of a document or other
information resource. For example, this tagged metadata element,
<meta name = "DC.Creator"
content = "Simpson, Homer">
says that Homer Simpson is the Creator, where the element named
Creator is defined in the DC element set. In the more general form,
<meta name = "PREFIX.ELEMENT_NAME"
content = "ELEMENT_VALUE">
the capitalized words are meant to be replaced in actual
descriptions; thus in the example,
ELEMENT_NAME was: Creator
ELEMENT_VALUE was: Simpson, Homer
and PREFIX was: DC
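As a minimal sketch of how software might read this general form, the
following Python fragment uses the standard html.parser module to split
each META name into its prefix and element name. The class and function
names (MetaCollector, parse_meta) are hypothetical and are not part of
this memo or of any Dublin Core tool.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (prefix, element_name, value) triples from META tags."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Assumption: META tags without a "PREFIX.ELEMENT_NAME" name
        # or without a content attribute are not of interest here.
        if name is None or content is None or "." not in name:
            return
        prefix, element_name = name.split(".", 1)
        self.elements.append((prefix, element_name, content))


def parse_meta(html_text):
    """Return the metadata triples found in html_text, in document order."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return collector.elements


sample = '<meta name = "DC.Creator"\n      content = "Simpson, Homer">'
print(parse_meta(sample))  # [('DC', 'Creator', 'Simpson, Homer')]
```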
Within a META tag the first letter of a Dublin Core element name is
capitalized. DC places no restriction on alphabetic case in an
element value, and any number of META tagged elements may appear
together, in any order. More than one DC element with the same name
may appear, and each DC element is optional. The next example is a
book description with two authors, two titles, and no other metadata.
<meta name = "DC.Title"
content = "The Communist Manifesto">
<meta name = "DC.Creator"
content = "Marx, K.">
<meta name = "DC.Creator"
content = "Engels, F.">
<meta name = "DC.Title"
content = "Capital">
The prefix "DC" precedes each Dublin Core element encoded with META,
and it is separated by a period (.) from the element name following
it. Each non-DC element should be encoded with a prefix that can be
used to trace its origin and definition; the linkage between prefix
and element definition is made with the LINK tag, as explained in the
next section. Non-DC elements, such as Email from the A-Core [AC],
may appear together with DC elements, as in
<meta name = "DC.Creator"
content = "Da Costa, Jos&eacute;">
<meta name = "AC.Email"
content = "dacostaj@peoplesmail.org">
<meta name = "DC.Title"
content = "Jesse &quot;The Body&quot; Ventura--A Biography">
This example also shows how some special characters may be encoded.
The author name in the first element contains a diacritic encoded as
an HTML character entity reference -- in this case an accented letter
E. Similarly, the last line contains two double-quote characters
encoded so as to avoid being interpreted as element content
delimiters.
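One way a metadata provider might produce such encoded values is
sketched below in Python, using the entity table in the standard
html.entities module. The function name encode_content is hypothetical.
Escaping "<" and ">" as well is an extra precaution assumed here, not
something this memo requires.

```python
from html.entities import codepoint2name


def encode_content(value):
    """Encode a metadata value for a double-quoted content attribute."""
    pieces = []
    for char in value:
        code = ord(char)
        if char in '&<>"':
            # Markup-significant characters become named entities.
            pieces.append(f"&{codepoint2name[code]};")
        elif code > 127:
            # Non-ASCII: use a named entity if HTML defines one,
            # otherwise fall back to a numeric character reference.
            name = codepoint2name.get(code)
            pieces.append(f"&{name};" if name else f"&#{code};")
        else:
            pieces.append(char)
    return "".join(pieces)


print(encode_content("Da Costa, Jos\u00e9"))
print(encode_content('Jesse "The Body" Ventura--A Biography'))
```

The two calls reproduce the encoded values of the DC.Creator and
DC.Title elements in the example above.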
4. The LINK Tag
The LINK tag of HTML may be used to associate an element name prefix
with the reference definition of the element set that it identifies.
A sequence of META tags describing a resource is incomplete without
one such LINK tag for each different prefix appearing in the
sequence. The previous example could be considered complete with the
addition of these two LINK tags:
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.0/">
<link rel = "schema.AC"
href = "http://metadata.net/ac/2.0/">
In general, the association takes the form
<link rel = "schema.PREFIX"
href = "LOCATION_OF_DEFINITION">
where, in actual descriptions, PREFIX is to be replaced by the prefix
and LOCATION_OF_DEFINITION by the URL or URN of the defining
document. When embedded in the HEAD part of an HTML file, a sequence
of LINK and META tags describes the information in the surrounding
HTML file itself. Here is a complete HTML file with its own embedded
description.
<html>
<head>
<title> A Dirge </title>
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.0/">
<meta name = "DC.Title"
content = "A Dirge">
<meta name = "DC.Creator"
content = "Shelley, Percy Bysshe">
<meta name = "DC.Type"
content = "poem">
<meta name = "DC.Date"
content = "1820">
<meta name = "DC.Format"
content = "text/html">
<meta name = "DC.Language"
content = "en">
</head>
<body><pre>
Rough wind, that moanest loud
Grief too sad for song;
Wild wind, when sullen cloud
Knells all the night long;
Sad storm, whose tears are vain,
Bare woods, whose branches strain,
Deep caves and dreary main, -
Wail, for the world's wrong!
</pre></body>
</html>
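The rule that every prefix used in a META tag needs a matching LINK tag
can be checked mechanically. The Python sketch below is one possible
approach, assuming the "schema.PREFIX" convention shown above. The
names SchemaChecker and undeclared_prefixes are hypothetical.

```python
from html.parser import HTMLParser


class SchemaChecker(HTMLParser):
    """Record schema LINK declarations and the prefixes used by META."""

    def __init__(self):
        super().__init__()
        self.schemas = {}        # prefix -> href of the defining document
        self.used_prefixes = []  # prefixes seen in META names, in order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link":
            rel = attributes.get("rel") or ""
            href = attributes.get("href")
            if rel.startswith("schema.") and href:
                self.schemas[rel[len("schema."):]] = href
        elif tag == "meta":
            name = attributes.get("name") or ""
            if "." in name:
                prefix = name.split(".", 1)[0]
                if prefix not in self.used_prefixes:
                    self.used_prefixes.append(prefix)


def undeclared_prefixes(html_text):
    """Return prefixes used in META tags that lack a schema LINK tag."""
    checker = SchemaChecker()
    checker.feed(html_text)
    checker.close()
    return [p for p in checker.used_prefixes if p not in checker.schemas]


description = """
<link rel = "schema.DC"
      href = "http://purl.org/DC/elements/1.0/">
<meta name = "DC.Creator"
      content = "Da Costa, Jos&eacute;">
<meta name = "AC.Email"
      content = "dacostaj@peoplesmail.org">
"""
print(undeclared_prefixes(description))  # ['AC']
```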
5. Encoding Recommendations
HTML allows more flexibility in principle and in practice than is
recommended here for encoding metadata. Limited flexibility
encourages easy development of software for extracting and processing
metadata. At this early evolutionary stage of internet metadata,
easy prototyping and experimentation hasten the development of
useful standards.
Adherence is therefore recommended to the tagging style exemplified
in this document as regards prefix and element name capitalization,
double-quoting (") of attribute values, and not starting more than
one META tag on a line. There is much room for flexibility, but
choosing a style and sticking with it will likely make metadata
manipulation and editing easier. The following META tags adhere to
the recommendations and carry identical metadata in three different
styles:
<META NAME="DC.Format"
CONTENT="text/html; 12 Kbytes">
<meta
Content = "text/html; 12 Kbytes"
Name = "DC.Format"
>
<meta name = "DC.Format" content = "text/html; 12 Kbytes">
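That the three styles carry identical metadata can be illustrated with
a case-insensitive HTML parser. The following Python sketch relies on
the standard html.parser module, which lower-cases tag and attribute
names. The names NameContent and name_content_pairs are hypothetical.

```python
from html.parser import HTMLParser


class NameContent(HTMLParser):
    """Collect (name, content) pairs from META tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            self.pairs.append(
                (attributes.get("name"), attributes.get("content"))
            )


def name_content_pairs(html_text):
    parser = NameContent()
    parser.feed(html_text)
    parser.close()
    return parser.pairs


styles = [
    '<META NAME="DC.Format"\n      CONTENT="text/html; 12 Kbytes">',
    '<meta\n   Content = "text/html; 12 Kbytes"\n   Name = "DC.Format"\n>',
    '<meta name = "DC.Format" content = "text/html; 12 Kbytes">',
]
results = [name_content_pairs(style) for style in styles]
# Each style yields the same single (name, content) pair.
print(results[0])
print(all(result == results[0] for result in results))
```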
Use of these recommendations is known to result in metadata that may
be harvested, indexed, and manipulated by popular, freely available
software packages such as [SWISH-E], [freeWAIS-sf2.0], [GLIMPSE],
[HARVEST], and [ISEARCH], among others. These conventions also work
with the metadata processing scripts appearing in the appendix, as
well as with most of the [DCPROJECTS] applications referenced from
the [DCHOME] site. Software support for the LINK tag and qualifier
conventions (see the next section) is not currently widespread.
Ordering of metadata elements is not preserved in general. Writers
of software for metadata indexing and display should try to preserve
relative ordering among META tagged elements having the same name
(e.g., among multiple authors); however, metadata providers and
searchers have no guarantee that ordering will be preserved in
metadata that passes through unknown systems.
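For software writers, one simple way to keep relative ordering among
elements having the same name is to store values in per-name lists
that are appended to in document order. The Python sketch below
assumes the (name, value) pairs have already been extracted; the
function name group_by_name is hypothetical.

```python
def group_by_name(pairs):
    """Group (name, value) pairs by name, keeping document order."""
    grouped = {}
    for name, value in pairs:
        # Appending keeps the order in which same-name elements appeared.
        grouped.setdefault(name, []).append(value)
    return grouped


pairs = [
    ("DC.Title", "The Communist Manifesto"),
    ("DC.Creator", "Marx, K."),
    ("DC.Creator", "Engels, F."),
    ("DC.Title", "Capital"),
]
grouped = group_by_name(pairs)
print(grouped["DC.Creator"])  # ['Marx, K.', 'Engels, F.']
```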
6. Dublin Core in Real Descriptions
In actual resource description it is often necessary to qualify
Dublin Core elements to add nuances of meaning. While neither the
general principles nor the specific semantics of DC qualifiers are
within scope of this document, everyday uses of the qualifier syntax
are illustrated to lend realism to later examples. Without further
explanation, the three ways in which the optional qualifier syntax is
currently (subject to change) used to supplement the META tag may be
summarized as follows:
<meta lang = "LANGUAGE_OF_METADATA_CONTENT" ... >
<meta scheme = "CONTROLLED_FORMAT_OR_VOCABULARY_OF_METADATA" ... >
<meta name = "PREFIX.ELEMENT_NAME.SUBELEMENT_NAME" ... >
Accordingly, a posthumous work in Spanish might be described with
<meta name = "DC.Language"
scheme = "rfc1766"
content = "es">
<meta name = "DC.Title"
lang = "es"
content = "La Mesa Verde y la Silla Roja">
<meta name = "DC.Title"
lang = "en"
content = "The Green Table and the Red Chair">
<meta name = "DC.Date.Created"
content = "1935">
<meta name = "DC.Date.Available"
content = "1939">
Note that the qualifier syntax and label suffixes (which follow an
element name and a period) used in examples in this document merely
reflect current trends in the HTML encoding of qualifiers. Use of
this syntax and these suffixes is neither a standard nor a
recommendation.
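To show how the three qualifier forms above might be read by software,
the following Python sketch separates the optional subelement name from
the element name and records the lang and scheme attributes. Like the
qualifier syntax itself, it is illustrative only. The names
QualifiedElement, QualifierParser, and parse_qualified are hypothetical.

```python
from html.parser import HTMLParser
from typing import NamedTuple, Optional


class QualifiedElement(NamedTuple):
    prefix: str
    element: str
    subelement: Optional[str]
    lang: Optional[str]
    scheme: Optional[str]
    content: Optional[str]


class QualifierParser(HTMLParser):
    """Collect META elements together with their optional qualifiers."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        parts = (attributes.get("name") or "").split(".")
        if len(parts) < 2:
            return  # no PREFIX.ELEMENT_NAME structure
        self.elements.append(QualifiedElement(
            prefix=parts[0],
            element=parts[1],
            # Anything after the element name is treated as a subelement.
            subelement=".".join(parts[2:]) or None,
            lang=attributes.get("lang"),
            scheme=attributes.get("scheme"),
            content=attributes.get("content"),
        ))


def parse_qualified(html_text):
    parser = QualifierParser()
    parser.feed(html_text)
    parser.close()
    return parser.elements


sample = """
<meta name = "DC.Language"
      scheme = "rfc1766"
      content = "es">
<meta name = "DC.Title"
      lang = "es"
      content = "La Mesa Verde y la Silla Roja">
<meta name = "DC.Date.Created"
      content = "1935">
"""
for element in parse_qualified(sample):
    print(element)
```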
7. Encoding Dublin Core Elements
This section consists of very simple Dublin Core encoding examples,
arranged by element.
Title (name given to the resource)
-----
<meta name = "DC.Title"
content = "Polycyclic aromatic hydrocarbon contamination">
<meta name = "DC.Title"
content = "Crime and Punishment">
<meta name = "DC.Title"
content = "Methods of Information in Medicine, Vol 32, No 4">
<meta name = "DC.Title"
content = "Still life #4 with flowers">
<meta name = "DC.Title"
lang = "de"
content = "Das Wohltemperierte Klavier, Teil I">
Creator (entity that created the content)
-------
<meta name = "DC.Creator"
content = "Gogh, Vincent van">
<meta name = "DC.Creator"
content = "van Gogh, Vincent">
<meta name = "DC.Creator"
content = "Mao Tse Tung">
<meta name = "DC.Creator"
content = "Mao, Tse Tung">
<meta name = "DC.Creator"
content = "Plato">
<meta name = "DC.Creator"
lang = "fr"
content = "Platon">
<meta name = "DC.Creator.Director"
content = "Sturges, Preston">
<meta name = "DC.Creator.Writer"
content = "Hecht, Ben">
<meta name = "DC.Creator.Producer"
content = "Chaplin, Charles">
Subject (topic or keyword)
-------
<meta name = "DC.Subject"
content = "heart attack">
<meta name = "DC.Subject"
scheme = "MESH"
content = "Myocardial Infarction; Pericardial Effusion">
<meta name = "DC.Subject"
content = "vietnam war">
<meta name = "DC.Subject"
scheme = "LCSH"
content = "Vietnamese Conflict, 1961-1975">
<meta name = "DC.Subject"
content = "Friendship">
<meta name = "DC.Subject"
scheme = "ddc"
content = "158.25">
Description (account, summary, or abstract of the content)
-----------
<meta name = "DC.Description"
lang = "en"
content = "The Author gives some Account of Himself and Family
-- His First Inducements to Travel -- He is
Shipwrecked, and Swims for his Life -- Gets safe on
Shore in the Country of Lilliput -- Is made a
Prisoner, and carried up the Country">
<meta name = "DC.Description"
content = "A tutorial and reference manual for Java.">
<meta name = "DC.Description"
content = "Seated family of five, coconut trees to the left,
sailboats moored off sandy beach to the right,
with volcano in the background.">
Publisher (entity that made the resource available)
---------
<meta name = "DC.Publisher"
content = "O'Reilly">
<meta name = "DC.Publisher"
content = "Digital Equipment Corporation">
<meta name = "DC.Publisher"
content = "University of California Press">
<meta name = "DC.Publisher"
content = "State of Florida (USA)">