<meta name = "DC.Creator"
content = "Simpson, Homer">
<meta name = "DC.Title"
content = "(--mbtitle)">
<meta name = "DC.Date.Created"
content = "(--mbfilemodtime)">
<meta name = "DC.Identifier"
content = "(--mbbaseURL)/(--mbfilename)">
<meta name = "DC.Format"
content = "text/html; (--mbfilesize)">
<meta name = "DC.Language"
content = "(--mblanguage)-BUREAUCRATESE">
<meta name = "RC.MetadataAuthority"
content = "Springfield Nuclear">
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.0/">
<link rel = "schema.RC"
href = "http://nukes.org/ReactorCore/rc">
The above template represents the metadata block that will describe
the document once the variable references are replaced with real
values. By the conventions of our script, the following variables
will be replaced in both the template and in the document:
   (--mbfilesize)       size of the final output file
   (--mbtitle)          title of the document
   (--mblanguage)       language of the document
   (--mbbaseURL)        beginning part of document identifier
   (--mbfilename)       last part of identifier (input file name plus .html)
   (--mbfilemodtime)    last modification date of the document
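The replacement convention listed above can be illustrated in isolation. The following is a minimal sketch, not part of the script given in this document: the helper name expand_vars and the hash of values are hypothetical, and the sample values are taken from the example in this section. It assumes that a reference whose name is not yet known should be left untouched so that a later pass (as is done for the file size) can fill it in.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: expand (--mbNAME) references using a hash of values.
# References with no entry in the hash are left as they are.
sub expand_vars {
    my ($text, $vars) = @_;
    $text =~ s/\(--mb(\w+)\)/exists $vars->{$1} ? $vars->{$1} : "(--mb$1)"/ge;
    return $text;
}

# Sample values taken from the example in this section.
my %vars = (
    title    => "Nutritional Allocation Increase",
    language => "en",
    baseURL  => "http://moes.bar.com/doh",
    filename => "homer.html",
);

# Both variables are known, so the identifier is fully expanded.
print expand_vars('content = "(--mbbaseURL)/(--mbfilename)">', \%vars), "\n";

# filesize is not in the hash, so the reference is preserved.
print expand_vars('content = "text/html; (--mbfilesize)">', \%vars), "\n";
```

The first line printed contains the complete identifier, while the second still contains the (--mbfilesize) reference.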
Kunze Informational [Page 16]
RFC 2731 Encoding Dublin Core Metadata in HTML December 1999
Here's an example HTML file to run the script on.
<html>
<head>
<!--metablock Nutritional Allocation Increase -->
<meta name = "DC.Type"
content = "Memorandum">
</head>
<body>
<p>
From: Acting Shift Supervisor
To: Plant Control Personnel
RE: (--mbtitle)
Date: (--mbfilemodtime)
<p>
Pursuant to directive DOH:10.2001/405aec of article B-2022,
subsection 48.2.4.4.1c regarding staff morale and employee
productivity standards, the current allocation of doughnut
acquisition funds shall be increased effective immediately.
</body>
</html>
Note that because replacement occurs throughout the document, the
provider need only enter the title once instead of twice (normally
the title must be entered once in the HTML head and again in the HTML
body). After running the script, the above file is transformed into
this:
<html>
<head>
<title> Nutritional Allocation Increase </title>
<meta name = "DC.Creator"
content = "Simpson, Homer">
<meta name = "DC.Title"
content = "Nutritional Allocation Increase">
<meta name = "DC.Date.Created"
content = "1999-03-08">
<meta name = "DC.Identifier"
content = "http://moes.bar.com/doh/homer.html">
<meta name = "DC.Format"
content = "text/html; 1320 bytes">
<meta name = "DC.Language"
content = "en-BUREAUCRATESE">
<meta name = "RC.MetadataAuthority"
content = "Springfield Nuclear">
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.0/">
<link rel = "schema.RC"
href = "http://nukes.org/ReactorCore/rc">
<meta name = "DC.Type"
content = "Memorandum">
</head>
<body>
<p>
From: Acting Shift Supervisor
To: Plant Control Personnel
RE: Nutritional Allocation Increase
Date: 1999-03-08
<p>
Pursuant to directive DOH:10.2001/405aec of article B-2022,
subsection 48.2.4.4.1c regarding staff morale and employee
productivity standards, the current allocation of doughnut
acquisition funds shall be increased effective immediately.
</body>
</html>
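One way to check a generated file is to read the metadata back out of it. The following is a minimal sketch under stated assumptions: the helper name meta_pairs is hypothetical, and the regular expression only recognizes the name-then-content layout used in this section. It is not a general HTML parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return [name, content] pairs for <meta> elements
# written with the name attribute first and the content attribute second.
sub meta_pairs {
    my ($html) = @_;
    my @pairs;
    while ($html =~ /<meta\s+name\s*=\s*"([^"]*)"\s+content\s*=\s*"([^"]*)"\s*>/gi) {
        push @pairs, [$1, $2];
    }
    return @pairs;
}

# Two elements copied from the transformed file shown above.
my $html = <<'END';
<meta name = "DC.Title"
content = "Nutritional Allocation Increase">
<meta name = "DC.Date.Created"
content = "1999-03-08">
END

# Print one "name: content" line per element found.
printf "%s: %s\n", @$_ for meta_pairs($html);
```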
Here is the script that accomplishes this transformation.
#!/depot/bin/perl
#
# This Perl script processes metadata block declarations of the form
# <!--metablock TITLE_OF_DOCUMENT --> and variable references of the
# form (--mbVARNAME), replacing them with full metadata blocks and
# variable values, respectively. Requires a "template" file.
# Outputs an HTML file.
#
# Invoke this script with a single filename argument, "foo". It creates
# an output file "foo.html" using a temporary working file "foo.work".
# The size of foo.work is measured after variable replacement, and is
# later inserted into the file in such a way that the file's size does
# not change in the process. Has little or no error checking.
$infile = shift;
open(IN, "< $infile")
    or die("Could not open input file \"$infile\"");
$workfile = "$infile.work";
unlink($workfile);
open(WORK, "+> $workfile")
    or die("Could not open work file \"$workfile\"");

@offsets = ();                  # records locations for late size replacement
$title = "";                    # gets the title during metablock processing
$language = "en";               # pre-set language here (not in the template)
$baseURL = "http://moes.bar.com/doh";   # pre-set base URL here also
$filename = "$infile.html";     # final output filename
$filesize = "(--mbfilesize)";   # replaced late (separate pass)
($year, $month, $day) = (localtime( (stat IN) [9] ))[5, 4, 3];
$filemodtime = sprintf "%04d-%02d-%02d", 1900 + $year, 1 + $month, $day;

sub putout {            # outputs current line with variable replacement
    if (! /\(--mb/) {
        print WORK;
        return;
    }
    if (/\(--mbfilesize\)/)             # remember where it was
        { push @offsets, tell WORK; }   # but don't replace yet
    s/\(--mbtitle\)/$title/g;
    s/\(--mblanguage\)/$language/g;
    s/\(--mbbaseURL\)/$baseURL/g;
    s/\(--mbfilename\)/$filename/g;
    s/\(--mbfilemodtime\)/$filemodtime/g;
    print WORK;
}

while (<IN>) {          # main loop for input file
    if (! /(.*)<!--metablock\s*(.*)/) {
        &putout;
        next;
    }
    $title = $2;
    $_ = $1;
    &putout;
    if ($title =~ s/\s*-->(.*)//) {
        $remainder = $1;
    }
    else {              # title continues on following lines
        while (<IN>) {
            if (/(.*?)\s*-->(.*)/) {    # closing "-->" found: save the
                $title .= $1;           # captures here, since $1 and $2
                $remainder = $2;        # do not survive past this block
                last;
            }
            $title .= $_;
        }
    }
    open(TPLATE, "< template")
        or die("Could not open template file");
    while (<TPLATE>)    # subloop for template file
        { &putout; }
    close(TPLATE);
    $_ = $remainder;
    &putout;
}
close(IN);
# Now replace filesize variables without altering total byte count.

select( (select(WORK), $| = 1) [0] );   # first flush output so we
if (($size = -s WORK) < 100000)         # can get final file size
    { $scale = 0; }                     # and set scale factor or
else {          # compute it, keeping width of size field low
    for ($scale = 0; $size >= 1000; $scale++)
        { $size /= 1024; }
}
$filesize = sprintf "%7.7s %sbytes",
    $size, (" ", "K", "M", "G", "T", "P") [$scale];

foreach $pos (@offsets) {       # loop through saved size locations
    seek WORK, $pos, 0;         # read the line found there
    $_ = <WORK>;
    # $filesize must be exactly as wide as "(--mbfilesize)"
    s/\(--mbfilesize\)/$filesize/g;
    seek WORK, $pos, 0;         # rewrite it with replacement
    print WORK;
}
close(WORK);

rename($workfile, "$filename")
    or die("Could not rename \"$workfile\" to \"$filename\"");

# ---- end of Perl script ----
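The late size replacement works only because the formatted size occupies exactly as many bytes as the "(--mbfilesize)" reference it overwrites. The following is a minimal sketch that isolates that step for inspection: the helper name format_filesize is hypothetical, and its body simply mirrors the scaling and sprintf call used in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper mirroring the size formatting in the script above.
# Sizes of 100000 bytes or more are scaled down by 1024 until below 1000,
# and the matching unit prefix is chosen.
sub format_filesize {
    my ($size) = @_;
    my $scale = 0;
    if ($size >= 100000) {
        for (; $size >= 1000; $scale++) { $size /= 1024; }
    }
    # 7-character number + space + 1-character prefix + "bytes" = 14
    return sprintf "%7.7s %sbytes",
        $size, (" ", "K", "M", "G", "T", "P")[$scale];
}

my $placeholder = "(--mbfilesize)";
for my $bytes (1320, 99999, 5_000_000) {
    my $field = format_filesize($bytes);
    printf "%-10d [%s] width %d (placeholder width %d)\n",
        $bytes, $field, length($field), length($placeholder);
}
```

Each formatted field comes out 14 characters wide, the same as the reference, so overwriting it in place leaves the file's byte count unchanged. The padding is part of the field, so a size of 1320 is written as "   1320  bytes".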
10. Author's Address
John A. Kunze
Center for Knowledge Management
University of California, San Francisco
530 Parnassus Ave, Box 0840
San Francisco, CA 94143-0840, USA
Fax: +1 415-476-4653
EMail: jak@ckm.ucsf.edu
12. Full Copyright Statement
Copyright (C) The Internet Society (1999). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.