      <meta name    = "DC.Creator"
            content = "Simpson, Homer">
      <meta name    = "DC.Title"
            content = "(--mbtitle)">
      <meta name    = "DC.Date.Created"
            content = "(--mbfilemodtime)">
      <meta name    = "DC.Identifier"
            content = "(--mbbaseURL)/(--mbfilename)">
      <meta name    = "DC.Format"
            content = "text/html; (--mbfilesize)">
      <meta name    = "DC.Language"
            content = "(--mblanguage)-BUREAUCRATESE">
      <meta name    = "RC.MetadataAuthority"
            content = "Springfield Nuclear">
      <link rel     = "schema.DC"
            href    = "http://purl.org/DC/elements/1.0/">
      <link rel     = "schema.RC"
            href    = "http://nukes.org/ReactorCore/rc">

   The above template represents the metadata block that will describe
   the document once the variable references are replaced with real
   values.  By the conventions of our script, the following variables
   will be replaced in both the template and in the document:

      (--mbfilesize)      size of the final output file
      (--mbtitle)         title of the document
      (--mblanguage)      language of the document
      (--mbbaseURL)       beginning part of document identifier
      (--mbfilename)      last part (minus .html) of identifier
      (--mbfilemodtime)   last modification date of the document

Kunze                        Informational                     [Page 16]

RFC 2731        Encoding Dublin Core Metadata in HTML      December 1999

   Here's an example HTML file to run the script on.

      <html>
      <head>
      <!--metablock Nutritional Allocation Increase -->
      <meta name    = "DC.Type"
            content = "Memorandum">
      </head>
      <body>
      <p>
      From:  Acting Shift Supervisor
      To:    Plant Control Personnel
      RE:    (--mbtitle)
      Date:  (--mbfilemodtime)
      <p>
      Pursuant to directive DOH:10.2001/405aec of article B-2022,
      subsection 48.2.4.4.1c regarding staff morale and employee
      productivity standards, the current allocation of doughnut
      acquisition funds shall be increased effective immediately.
      </body>
      </html>

   Note that because replacement occurs throughout the document, the
   provider need only enter the title once instead of twice (normally
   the title must be entered once in the HTML head and again in the
   HTML body).
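
   As an aside, the variable replacement described above can be
   sketched in a few lines.  This is a minimal illustration in Python,
   not the Perl script from this document; the VALUES table, the
   VAR_REF pattern, and the expand() helper are hypothetical names
   chosen for the example, and the values are copied from the sample
   memo.  It leaves (--mbfilesize) untouched, on the assumption that
   the size is filled in by a separate, later pass.

```python
import re

# Hypothetical values for illustration; a real run would derive them
# from the input file (title, modification time) and local settings.
VALUES = {
    "title": "Nutritional Allocation Increase",
    "language": "en",
    "baseURL": "http://moes.bar.com/doh",
    "filename": "homer.html",
    "filemodtime": "1999-03-08",
}

# Matches variable references of the form (--mbNAME).
VAR_REF = re.compile(r"\(--mb(\w+)\)")


def expand(line, values=VALUES):
    """Replace each known (--mbNAME) reference in a line.

    References whose NAME is not in `values` (for example
    (--mbfilesize)) are left exactly as they were found.
    """
    return VAR_REF.sub(lambda m: values.get(m.group(1), m.group(0)), line)


if __name__ == "__main__":
    print(expand("RE:    (--mbtitle)"))
    print(expand('content = "(--mbbaseURL)/(--mbfilename)"'))
```

   Because the same helper is applied to template lines and to body
   lines alike, a single title value reaches both the HTML head and
   the memo text.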

   After running the script, the above file is transformed into this:

      <html>
      <head>
      <title> Nutritional Allocation Increase </title>
      <meta name    = "DC.Creator"
            content = "Simpson, Homer">
      <meta name    = "DC.Title"
            content = "Nutritional Allocation Increase">
      <meta name    = "DC.Date.Created"
            content = "1999-03-08">
      <meta name    = "DC.Identifier"
            content = "http://moes.bar.com/doh/homer.html">
      <meta name    = "DC.Format"
            content = "text/html;    1320  bytes">
      <meta name    = "DC.Language"
            content = "en-BUREAUCRATESE">
      <meta name    = "RC.MetadataAuthority"
            content = "Springfield Nuclear">
      <link rel     = "schema.DC"
            href    = "http://purl.org/DC/elements/1.0/">
      <link rel     = "schema.RC"
            href    = "http://nukes.org/ReactorCore/rc">
      <meta name    = "DC.Type"
            content = "Memorandum">
      </head>
      <body>
      <p>
      From:  Acting Shift Supervisor
      To:    Plant Control Personnel
      RE:    Nutritional Allocation Increase
      Date:  1999-03-08
      <p>
      Pursuant to directive DOH:10.2001/405aec of article B-2022,
      subsection 48.2.4.4.1c regarding staff morale and employee
      productivity standards, the current allocation of doughnut
      acquisition funds shall be increased effective immediately.
      </body>
      </html>

   Here is the script that accomplishes this transformation.

#!/depot/bin/perl
#
# This Perl script processes metadata block declarations of the form
# <!--metablock TITLE_OF_DOCUMENT --> and variable references of the
# form (--mbVARNAME), replacing them with full metadata blocks and
# variable values, respectively.  Requires a "template" file.
# Outputs an HTML file.
#
# Invoke this script with a single filename argument, "foo".  It creates
# an output file "foo.html" using a temporary working file "foo.work".
# The size of foo.work is measured after variable replacement, and is
# later inserted into the file in such a way that the file's size does
# not change in the process.  Has little or no error checking.

$infile = shift;
open(IN, "< $infile")
        or die("Could not open input file \"$infile\"");
$workfile = "$infile.work";
unlink($workfile);
open(WORK, "+> $workfile")
        or die("Could not open work file \"$workfile\"");

@offsets = ();          # records locations for late size replacement
$title = "";            # gets the title during metablock processing
$language = "en";       # pre-set language here (not in the template)
$baseURL = "http://moes.bar.com/doh";   # pre-set base URL here also
$filename = "$infile.html";             # final output filename
$filesize = "(--mbfilesize)";           # replaced late (separate pass)

($year, $month, $day) = (localtime( (stat IN) [9] ))[5, 4, 3];
$filemodtime = sprintf "%s-%02s-%02s", 1900 + $year, 1 + $month, $day;

sub putout {            # outputs current line with variable replacement
        if (! /\(--mb/) {
                print WORK;
                return;
        }
        if (/\(--mbfilesize\)/)                 # remember where it was
                { push @offsets, tell WORK; }   # but don't replace yet
        s/\(--mbtitle\)/$title/g;
        s/\(--mblanguage\)/$language/g;
        s/\(--mbbaseURL\)/$baseURL/g;
        s/\(--mbfilename\)/$filename/g;
        s/\(--mbfilemodtime\)/$filemodtime/g;
        print WORK;
}

while (<IN>) {                          # main loop for input file
        if (! /(.*)<!--metablock\s*(.*)/) {
                &putout;
                next;
        }
        $title = $2;
        $_ = $1;
        &putout;
        if ($title =~ s/\s*-->(.*)//) {
                $remainder = $1;
        }
        else {
                while (<IN>) {
                        $title .= $_;
                        last if (/(.*)\s*-->(.*)/);
                }
                $title .= $1;
                $remainder = $2;
        }
        open(TPLATE, "< template")
                or die("Could not open template file");
        while (<TPLATE>)                # subloop for template file
                { &putout; }
        close(TPLATE);
        $_ = $remainder;
        &putout;
}
close(IN);

# Now replace filesize variables without altering total byte count.
select( (select(WORK), $| = 1) [0] );   # first flush output so we
if (($size = -s WORK) < 100000)         # can get final file size
        { $scale = 0; }                 # and set scale factor or
else {          # compute it, keeping width of size field low
        for ($scale = 0; $size >= 1000; $scale++)
                { $size /= 1024; }
}
$filesize = sprintf "%7.7s %sbytes",
        $size, (" ", "K", "M", "G", "T", "P") [$scale];

foreach $pos (@offsets) {       # loop through saved size locations
        seek WORK, $pos, 0;     # read the line found there
        $_ = <WORK>;
        # $filesize must be exactly as wide as "(--mbfilesize)"
        s/\(--mbfilesize\)/$filesize/g;
        seek WORK, $pos, 0;     # rewrite it with replacement
        print WORK;
}
close(WORK);
rename($workfile, "$filename")
        or die("Could not rename \"$workfile\" to \"$filename\"");

# ---- end of Perl script ----

10. Author's Address

   John A. Kunze
   Center for Knowledge Management
   University of California, San Francisco
   530 Parnassus Ave, Box 0840
   San Francisco, CA  94143-0840, USA

   Fax:   +1 415-476-4653
   EMail: jak@ckm.ucsf.edu

11. References

   [AAT]            Art and Architecture Thesaurus, Getty Information
                    Institute.
                    http://shiva.pub.getty.edu/aat_browser/

   [AC]             The A-Core: Metadata about Content Metadata, (in
                    progress)
                    http://metadata.net/ac/draft-iannella-admin-01.txt

   [DC1]            Weibel, S., Kunze, J., Lagoze, C. and M. Wolf,
                    "Dublin Core Metadata for Resource Discovery",
                    RFC 2413, September 1998.
                    ftp://ftp.isi.edu/in-notes/rfc2413.txt

   [DCHOME]         Dublin Core Initiative Home Page.
                    http://purl.org/DC/

   [DCPROJECTS]     Projects Using Dublin Core Metadata.
                    http://purl.org/DC/projects/index.htm

   [DCT1]           Dublin Core Type List 1, DC Type Working Group,
                    March 1999.
                    http://www.loc.gov/marc/typelist.html

   [freeWAIS-sf2.0] The enhanced freeWAIS distribution, February 1999.
                    http://ls6-www.cs.uni-dortmund.de/ir/projects/freeWAIS-sf/

   [GLIMPSE]        Glimpse Home Page.
                    http://glimpse.cs.arizona.edu/

   [HARVEST]        Harvest Web Indexing.
                    http://www.tardis.ed.ac.uk/harvest/

   [HTML4.0]        Hypertext Markup Language 4.0 Specification, April
                    1998.
                    http://www.w3.org/TR/REC-html40/

   [ISEARCH]        Isearch Resources Page.
                    http://www.etymon.com/Isearch/

   [ISO639-2]       Code for the representation of names of languages,
                    1996.
                    http://www.indigo.ie/egt/standards/iso639/iso639-2-en.html

   [ISO8601]        ISO 8601:1988(E), Data elements and interchange
                    formats -- Information interchange --
                    Representation of dates and times, International
                    Organization for Standardization, June 1988.
                    http://www.iso.ch/markete/8601.pdf

   [MARC]           USMARC Format for Bibliographic Data, US Library of
                    Congress.
                    http://lcweb.loc.gov/marc/marc.html

   [PERL]           L. Wall, T. Christiansen, R. Schwartz, Programming
                    Perl, Second Edition, O'Reilly, 1996.

   [RDF]            Resource Description Framework Model and Syntax
                    Specification, February 1999.
                    http://www.w3.org/TR/REC-rdf-syntax/

   [RFC1766]        Alvestrand, H., "Tags for the Identification of
                    Languages", RFC 1766, March 1996.
                    ftp://ftp.isi.edu/in-notes/rfc1766.txt

   [SWISH-E]        Simple Web Indexing System for Humans - Enhanced.
                    http://sunsite.Berkeley.EDU/SWISH-E/

   [TGN]            Thesaurus of Geographic Names, Getty Information
                    Institute.
                    http://shiva.pub.getty.edu/tgn_browser/

   [WTN8601]        W3C Technical Note - Profile of ISO 8601 Date and
                    Time Formats.
                    http://www.w3.org/TR/NOTE-datetime

   [XML]            Extensible Markup Language (XML).
                    http://www.w3.org/TR/REC-xml

12. Full Copyright Statement

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.