<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">



<HTML>







<HEAD>



<!-- This document was created from RTF source by rtftohtml version 3.0.1 -->







	<META NAME="GENERATOR" Content="Symantec Visual Page 1.0">



	<META HTTP-EQUIV="Content-Type" CONTENT="text/html;CHARSET=iso-8859-1">



	<TITLE>Archive and Document Management</TITLE>



</HEAD>







<BODY BACKGROUND="r2harch.gif" TEXT="#000000" BGCOLOR="#FFFFFF">







<H2 ALIGN="CENTER"><A HREF="ch13.htm"><IMG SRC="blanprev.gif" WIDTH="37" HEIGHT="37" ALIGN="BOTTOM" BORDER="2"></A><A HREF="index-1.htm"><IMG SRC="blantoc.gif" WIDTH="42" HEIGHT="37" ALIGN="BOTTOM" BORDER="2"></A><A HREF="ch15.htm"><IMG SRC="blannext.gif" WIDTH="45" HEIGHT="37" ALIGN="BOTTOM" BORDER="2"></A><BR>



<BR>



<FONT COLOR="#0000AA">14</FONT><BR>



<A NAME="Heading1"></A><FONT COLOR="#000077">Archive and Document Management<BR>



</FONT>



<HR>



</H2>







<UL>



	<LI><A HREF="#Heading1">Archive and Document Management</A>



	<UL>



		<LI><A HREF="#Heading2">General Archive Management Considerations</A>



		<UL>



			<LI><A HREF="#Heading3">Planning, Design, and Layout</A>



			<LI><A HREF="#Heading5">Revision Control</A>



			<LI><A HREF="#Heading6">Summary of Archive Management Issues</A>



		</UL>



		<LI><A HREF="#Heading7">Parsing, Converting, Editing, and Verifying HTML with Perl</A>



		<UL>



			<LI><A HREF="#Heading8">General Parsing Issues</A>



		</UL>



		<LI><A HREF="#Heading9">Listing 14.1. simpleparse.</A>



		<LI><A HREF="#Heading10">Listing 14.2. simpleparse-net.</A>



		<UL>



			<LI><A HREF="#Heading11">Editing and Verifying HTML</A>



		</UL>



		<LI><A HREF="#Heading12">Listing 14.3. relativize.</A>



		<UL>



			<LI><A HREF="#Heading13">Parsing HTTP Logfiles</A>



		</UL>



		<LI><A HREF="#Heading14">Listing 14.4. GD_Logfile.pm.</A>



		<LI><A HREF="#Heading15">Listing 14.5. GD_Logfile.test.</A>



		<UL>



			<LI><A HREF="#Heading16">Converting Existing Documentation to HTML</A>



			<LI><A HREF="#Heading18">Converting HTML to Other Formats</A>



			<LI><A HREF="#Heading19">Making Existing Archives Available via HTTP</A>



		</UL>



		<LI><A HREF="#Heading21">Summary</A>



	</UL>



</UL>







<P>



<HR>



</P>







<UL>



	<LI>General Archive Management Considerations

	<LI>HTML with Perl



</UL>







<P>The typical Webmaster is often challenged by tasks other than creating HTML or



writing CGI programs. He or she also must be familiar with many other techniques



and practices that are commonly used to build and maintain a networked archive and



its components. In this chapter, we'll discuss a number of those tasks and provide



you with some tools to help accomplish them.



<H3 ALIGN="CENTER"><A NAME="Heading2"></A><FONT COLOR="#000077">General Archive Management



Considerations</FONT></H3>



<P>The art and philosophy of archive management on a network predates the Web by



a long time. One of the primary intents of the Internet was, and still is, to allow



the sharing of documents. Some of the early protocols and tools for sharing electronic



resources are still in wide use today, including FTP, NFS, and even Gopher.</P>



<P>When making resources available via any type of server, you need to consider a
number of tactics and practices. Some of these are related to security and are explored
in Chapter 3, &quot;Security on the Web.&quot; There are many others, and as far
as I know, no single document covers them all. The collective experience
of the many thousands of administrators who have contributed to and defined this
body of knowledge would be difficult to summarize in a library, much less in a single
chapter of a book.</P>



<P>There are, however, a number of general issues that you become aware of as you
develop an archive and explore the work that others have done. I hope to cover many
of the important issues and their associated tasks in this chapter. As always, you
can explore other resources, including Usenet, various Web sites, and possibly even
individual administrators whose approach looks as though it might work for you. If
you find such a site, try dropping a line to the administrator, asking him or her to
share a few tips. Of course, you may be completely ignored, but you may also be rewarded
with a buried bone or two, which might save you time and energy in the future.</P>



<P>You'll notice in this chapter that Perl isn't the primary topic on every page.
As we've said, the intent of this book is to show and teach you how to use Perl in
your Web programming duties and tasks. However, the other works we've studied seem to
give the issues and topics in this chapter rather minimal coverage, so I'm covering
some of them primarily for the sake of completeness.</P>



<H4 ALIGN="CENTER"><A NAME="Heading3"></A><FONT COLOR="#000077">Planning, Design,



and Layout</FONT></H4>



<P>Deciding on the structure and layout of your archive is one of the important decisions
you'll make if you're just starting out. After you've laid out your archive, it won't be
easy to change its structure or layout. Plan carefully, and try to consider everything
that may happen to your archive in the future, before you ever create the first
directory or file. Let's consider some of the most important issues now.</P>

<P><B><TT>Document Naming</TT></B> The names that you give to your documents and
directories are important for several reasons. First, and possibly most useful to you
as the archive maintainer, a name should give you some notion of what's inside a
document or directory. Another consideration is whether the files and directories you
create must be usable on DOS or other architectures that don't support long filenames.</P>














<H3 ALIGN="CENTER">



<HR WIDTH="82%">



<BR>



<FONT COLOR="#000077">NOTE:</FONT></H3>











<BLOCKQUOTE>



	<P>There are essentially two schools of thought on naming file system elements. The
	first stipulates that the names assigned to the elements within an archive should
	let you determine the contents of a file or directory from its name. The second,
	which follows the ISO9660 specification, stipulates that names should follow the 8.3
	format and use only alphanumeric characters. Obviously, the restriction to eight
	characters in the primary component of the name limits your ability to assign names
	based on contents. You should consider whether your archive will ever need to reside
	on an operating system that requires the 8.3 format (DOS), or whether you'll ever
	make it available via CD-ROM. In either case, you'll probably want to choose the
	ISO9660 naming conventions. Don't forget the possibility that, in the future, you may
	also wish to have your archive mirrored to a system that doesn't support long names.
	If you're already running under a file system that handles long names and need to
	migrate or mirror your archive to a system that handles only short names, you might
	have to make some major changes in order for everything to work. We'll discuss how to
	perform this transition later in this chapter, in the section entitled &quot;Moving
	an Entire Archive,&quot; but it's definitely nontrivial.<BR>



	



<HR>











</BLOCKQUOTE>
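<P>If you're weighing the 8.3 convention described in the note above, a few lines of
Perl can tell you which of your existing names would break it. The following is only a
minimal sketch and isn't one of this chapter's listings. The <TT>is_8_3_name</TT> helper
is hypothetical, and it assumes that letters, digits, and underscores are the only
acceptable characters and that case doesn't matter, so adjust the pattern to whatever
your target system actually requires.</P>

<PRE>
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: returns 1 if $name fits the 8.3 pattern, else 0.
# Assumption: only letters, digits, and underscores are legal, in any case.
sub is_8_3_name {
    my ($name) = @_;
    return $name =~ /^[A-Za-z0-9_]{1,8}(?:\.[A-Za-z0-9_]{1,3})?\z/ ? 1 : 0;
}

# Print each name from the command line that would need to be renamed.
foreach my $name (@ARGV) {
    print "$name\n" unless is_8_3_name($name);
}
</PRE>

<P>Run as <TT>perl check83.pl index.html ch14.htm</TT> (the script name is arbitrary),
this prints <TT>index.html</TT>, whose four-character extension doesn't fit, and stays
silent about <TT>ch14.htm</TT>.</P>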







<P>In any case, you'll need to reserve the extension component of the filenames for
MIME typing, which lets your server tell the browser how to handle the document. See
Chapter 5, &quot;Putting It All Together,&quot; for more details. Be sure to check
that your server's <TT>mime.types</TT> and <TT>srm.conf</TT> files follow the standard
conventions for extensions, and add configuration entries to your server for any
additional types that you define.</P>

<P><B><TT>Archive Hierarchical Organization</TT></B> The directory tree that makes up
your archive is one that you'll be &quot;climbing&quot; up and down quite often. You
should make its branches easy to remember and intuitive to understand. Each new resource
in your archive will have to be stored somewhere in this tree. When you use an
unambiguous, comprehensive structure for classification of resources according to their
storage location, deciding where to place things (and where to find them later) will be
a lot easier.</P>



<P>After you've decided on a naming convention, you'll want to spend some time planning



the structure of the directories. Naturally, if you're using long names, you can



be pretty creative with your layout; if not, then I recommend that you use some sort



of simple mapping from an ordered list of eight-character names to corresponding



groups or classifications.</P>
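<P>One way to keep such a mapping honest is to store it in a single place and generate
the catalog from it. The sketch below is only an illustration of that idea. The
directory names and classifications in <TT>%class_for</TT> are invented for the example,
and nothing here is a convention that your server or Perl itself requires.</P>

<PRE>
#!/usr/bin/perl -w
use strict;

# Hypothetical mapping from short directory names to the classifications
# they stand for. Every name and description here is illustrative.
my %class_for = (
    'DOCS0001', 'Product manuals',
    'DOCS0002', 'Release notes',
    'IMGS0001', 'Logos and icons',
);

# Print the catalog in name order, so the output can be kept as a
# text file at the top of the archive.
foreach my $dir (sort keys %class_for) {
    printf "%-8s  %s\n", $dir, $class_for{$dir};
}
</PRE>

<P>Keeping the list in one hash means that adding a classification is a one-line change,
and the printed catalog never drifts out of step with the names actually in use.</P>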



<P>You might point out, and you'd be correct, that the structure of the HTML document



already gives the notion of hierarchy to the resources to which it refers. However,



this applies only to the browser and gives no advantage to the maintainer of the



documents. Creating structure, in the form of directories (or folders) in your archive,



makes the HTML a bit more complicated to write but relieves the confusion and intimidation



of having all the files reside in one location.














<P><B><TT>Configuration Management</TT></B>











<BLOCKQUOTE>



	<P>&quot;A set of procedures for tracking and managing software throughout its lifecycle&quot;
	(Configuration Management for Software, Compton &amp; Connor, 1994, ISBN 0-442-01746-4).







</BLOCKQUOTE>







<P>This notion of structure also arises from the science of configuration management
in general. We'll be discussing another aspect of configuration management, revision
control, later in this chapter.</P>

<P><B><TT>Access and Security</TT></B> Another advantage of creating a structured
archive is that most HTTP servers can restrict access on a per-directory basis.
Configuring the server to do this has been covered elsewhere, and I won't go into how
it's done here. I point it out only to highlight the added value of planning and
creating a sound directory structure for your archive. Of course, to use this feature
selectively, you must plan the structure with access restrictions in mind, which is
another consideration in the planning stage.</P>

<P><B><TT>Top-Level Documentation</TT></B> Every archive directory should have some
sort of explanation of its purpose and, ideally, a description of its contents. Whether
this description is intended only for the maintainer or also for public access is up to
you. Ideally, this file would be located within the directory that it describes. It
could be the <TT>index.html</TT> and thus serve the dual purpose of describing the
contents to the browser and the maintainer. This document (probably just a text file)
will help the person considering a change to the archive's contents decide whether
this location is appropriate for the change or addition.</P>
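<P>A rule like this one is easy to let slip, so it may be worth checking mechanically.
The following is a minimal sketch, not a finished tool. It uses the standard
<TT>File::Find</TT> module, and it assumes that a directory counts as documented if it
holds a file named <TT>README</TT> or <TT>index.html</TT>. Both that convention and the
<TT>is_documented</TT> helper are my own choices for the example.</P>

<PRE>
#!/usr/bin/perl -w
use strict;
use File::Find;

# Assumed convention: either of these files documents a directory.
my @doc_names = ('README', 'index.html');

# Returns 1 if $dir contains at least one description file, else 0.
sub is_documented {
    my ($dir) = @_;
    foreach my $name (@doc_names) {
        return 1 if -f "$dir/$name";
    }
    return 0;
}

# Walk the tree rooted at the first argument (default: the current
# directory) and print every directory that lacks a description file.
# find() changes into each parent directory, so $_ is a relative name.
my $root = shift(@ARGV) || '.';
find(sub {
    return unless -d $_;
    print "$File::Find::name\n" unless is_documented($_);
}, $root);
</PRE>

<P>Running the script against the top of your archive gives you a to-do list of
directories that still need an explanation of their purpose.</P>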



<H4 ALIGN="CENTER"><A NAME="Heading5"></A><FONT COLOR="#000077">Revision Control</FONT></H4>



<P>The process (and rigor) of revision control is often overlooked or even ignored
when an administrator manages an archive. However, there are some very good reasons
to use some sort of version control when creating and updating your resources.</P>

<P><B><TT>A Policy for Change--Description</TT></B> The process of making your documents
available via the Web is really one of publishing. When you, as a representative
of a company, make a document available, you're making a statement that represents
your company. While some of the issues and legalities are still murky, you should
consider the liability that you or your company assumes when making documents available.
The information within the documents should be correct, free of misrepresentations,
and, insofar as possible, verifiable.</P>



<P>Such considerations give rise to the need for a policy for the management of the


