classparser.pm

This software conveniently parses an HTML web page into a tree.
		# See if the function is a built-in one; if not, check in the
		# class map (if it exists).
		##
		if ( /^(\w+)::/ && exists $function_map{ $1 } ) {
			$func = $function_map{ $1 };
		} else {
			$func = $class_map->{ $_ } if $class_map;
		}

		##
		# There is no function having the given class name: silently
		# ignore it...maybe it's a style name.
		##
		next unless $func;

		##
		# Got a function: call it to visit the current HTML node.
		##
		$result = $func->( $object, $node, $_, $is_end_tag );

		##
		# For end tags, we call the function only for the first class
		# since it makes no sense to call more than one.
		##
		last if $is_end_tag;

		##
		# For start tags, we "short-circuit" the calling of multiple
		# classes and return when the first class returns false.
		##
		return 0 unless $result;
	}

	if ( $is_end_tag ) {
		##
		# For end tags, simply emit the end tag and return whatever the
		# result is.
		##
		print '</', $node->name(), '>';
		return $result;
	}

	##
	# For start tags, emit it plus all of its attributes.
	##
	print '<', $node->name();
	my $atts = $node->atts();
	while ( my( $att, $val ) = each %{ $atts } ) {
		print " $att=\"$val\"";
	}
	print '>';
	return 1;
}

1;
__END__

=head1 NAME

C<Apache::HTML::ClassParser> - Apache mod_perl extension for generating dynamic HTML pages based on C<CLASS> attributes

=head1 SYNOPSIS

 # In Apache's httpd.conf file:

 AddType text/html	.chtml
 <Files *.chtml>
   SetHandler		perl-script
   PerlHandler		+Apache::HTML::ClassParser
 </Files>

=head1 DESCRIPTION

C<Apache::HTML::ClassParser> is yet another C<mod_perl> Apache module
for dynamically generating HTML.
Its distinctive feature, unlike existing techniques,
is that it uses I<pure>, standard HTML files:
no print-statement-laden CGI scripts,
no embedded statements from some programming language,
and no pseudo-HTML elements.
Code is cleanly separated into a separate file.
What links the two together are C<CLASS> attributes for HTML elements.

=head1 CONFIGURATION

In order to have C<ClassParser> be a handler for HTML files,
configuration directives, such as those shown in the SYNOPSIS,
must be added to Apache's C<httpd.conf> file.
The HTML files to be handled by C<ClassParser>,
as with all Apache handler modules,
can be specified either by location, filename extension, or both.
For more detail on Apache configuration directives,
see [Apache].

=head2 Cooperation with C<Apache::Filter>

C<ClassParser> is C<Apache::Filter>-aware.
However, if used in a filter chain, it B<must> be the first filter.
(This is because it uses the C<HTML::Tree> module
that uses mmap(2) to read an HTML file.)
For example, to configure Apache to have HTML files run through
C<ClassParser> then C<Apache::SSI> (server-side includes), do:

    PerlModule		Apache::Filter
    PerlModule		Apache::HTML::ClassParser
    PerlModule		Apache::SSI
    AddType text/html	.chtml
    <Files *.chtml>
      SetHandler	perl-script
      PerlSetVar	Filter On
      PerlHandler	Apache::HTML::ClassParser Apache::SSI
    </Files>

=head1 TERMINOLOGY AND CONVENTIONS

=head2 Element vs.
Tag

It is often the case that the term HTML "tag" is used
when the correct term of "element" should be.
From the HTML 4.0 specification, section 3.2.1, "Elements":

=over 4

=item

I<Elements are not tags.
Some people refer to elements as tags (e.g., "the >C<P>I< tag").
Remember that the element is one thing,
and the tag (be it start or end tag) is another.
For instance, the >C<HEAD>I< element is always present,
even though both start and end >C<HEAD>I< tags may be missing in the markup.>

=back

In this documentation,
the distinction between "element" and "tag" is necessary.

=head2 Class vs. C<CLASS>

In this documentation,
there are unfortunately two meanings of the word "class":

=over 4

=item 1.

A class attribute of an HTML element, e.g.:

    <H1 CLASS="heading_1">Introduction</H1>

where C<CLASS>es are typically used to convey style information.

=item 2.

A Perl class from which objects are created, e.g.:

    package MyClass;

    sub new {
        my $class = shift;
        my $this = {};
        return bless $this, $class;
    }

    $object = MyClass->new();

(See [Wall], pp.
290-292.)

=back

Therefore, throughout this document,
"class" written as "C<CLASS>"
shall mean the class attribute of an HTML element (case 1)
and "class" written simply as "class"
shall mean a Perl class (case 2).

=head1 The HTML File

The file for a web page is in pure HTML.
(It can also contain JavaScript code,
but that's irrelevant for the purpose of this discussion.)
At every location in the HTML file where something is to happen dynamically,
an HTML element must contain a C<CLASS> attribute
(and perhaps some "dummy" content).
(The dummy content allows the web page designer to create a mock-up page.)

For example,
suppose the options in a menu are to be retrieved from a relational database,
say the flavors available on an ice cream shop's web site.
The HTML would look like this:

    <!-- ice_cream.chtml -->
    <SELECT NAME="Flavors" CLASS="query_flavors">
      <OPTION CLASS="next_flavor" VALUE="0">Tooty Fruity
    </SELECT>

The C<CLASS>es C<query_flavors> and C<next_flavor>
will be used to generate HTML dynamically.
The values of the C<CLASS> attributes can be anything
as long as they agree with those in the code file
(specified later).
The text "Tooty Fruity" is dummy content.

The C<query_flavors> C<CLASS> will be used
to perform the query from the database;
C<next_flavor> will be used to
fetch every tuple returned from the query
and to substitute the name and ID number of the flavor.

The value of a C<CLASS> attribute may contain multiple classes
separated by whitespace.
(More on this later.)

=head1 The Code File

For every HTML file that is to be used with this technique,
there B<must> be an associated code file in Perl.
It B<must> have the same name as the HTML file
except that the extension is C<.pm> rather than C<.chtml>.
(There is an exception; see "Using a Different Code File via C<pm_uri>" below.)
That code file B<must> define its own package (a.k.a.
module), e.g.:

    # ice_cream.pm
    package IceCream;

to implement a class;
that class B<must> have a constructor.

=head2 The Constructor and C<class_map>

A package requires a constructor method that B<must> be named C<new()>.
A minimal such constructor
(for which most of the code is taken from [Wall], p. 295)
is:

    sub new {
        my $that = shift;
        my $class = ref( $that ) || $that;
        my $this = {
            class_map => {
                query_flavors => \&query_flavors,
                next_flavor   => \&next_flavor,
            },
            # other stuff you want here ...
        };
        return bless $this, $class;
    }

A second requirement is that the object's hash
B<must> contain a C<class_map> key
whose value is a reference to a hash
containing a mapping from C<CLASS> attribute values
from the HTML file to functions
(methods of the class)
in the Perl file.
B<This is the key concept>:
it is the C<class_map> that links the HTML file to the Perl code.

In the above constructor,
the C<CLASS> names C<query_flavors> and C<next_flavor>
both map to methods having the same name.
In practice, this probably will (and should) be the case;
however, there is no requirement that it be so.
This allows more than one C<CLASS> name to map to the same method.
(If there were such a requirement,
there would be no need for the C<class_map>.)

=head2 Class Methods

Class methods are passed the following arguments:

=over 15

=item C<$this>

A reference to an object of a class.

=item C<$node>

A reference to an HTML element node, e.g., C<SELECT>.

=item C<$class>

The value of the C<CLASS> attribute of the HTML element
the method is being called for.
This is useful if more than one class
maps to the same method.

=item C<$is_end_tag>

True only when the method is being called for the end tag of an HTML element.

=back

Class methods B<must> return a Boolean value
(zero or non-zero for false or true, respectively).
There are two meanings for the return value;
they are the same as for I<visitor> functions used by
B<HTML::Tree>.
Repeated here for convenience, they are:

=over 4

=item 1.

If the $is_end_tag argument is false,
returning false means:
do not visit any of the current node's child nodes,
i.e., skip them and proceed directly to the current node's next sibling
and also do not call the I<visitor> again for the end tag;
returning true means: do visit all child nodes
and call the I<visitor> again for the end tag.

=item 2.

If the $is_end_tag argument is true,
returning false means:
proceed normally to the next sibling;
returning true means:
loop back and repeat the visit cycle from the beginning
by revisiting the start tag of the current element node
(case 1 above).

=back

The implementation of the C<query_flavors()> and C<next_flavor()> methods
shall be presented in stages.
The C<query_flavors()> method begins by getting its arguments
as described above:

    sub query_flavors {
        my( $this, $node, $class, $is_end_tag ) = @_;

The query must be performed upon encountering the start tag
of the C<SELECT> element;
therefore, the method returns false immediately if $is_end_tag is true.
This tells C<ClassParser> not to proceed with parsing the C<SELECT> element's
child elements again (in this case, the single C<OPTION> element)
and to proceed to its next sibling element
(i.e., the element after the C</SELECT> end tag):

        return 0 if $is_end_tag;

The bulk of the code is standard DBI/SQL.
A copy of the database and statement handles is stored in the object's hash
so the C<next_flavor()> method can access them later:

        $this->{ dbh } = DBI->connect( 'DBI:mysql:ice_cream:localhost' );
        $this->{ sth } = $this->{ dbh }->prepare( '
            SELECT   flavor_id, flavor_name
            FROM     flavors
            ORDER BY flavor_name
        ' );
        $this->{ sth }->execute();

(If the C<Apache::DBI> module was specified ahead of C<ClassParser>
in the Apache C<httpd.conf> file,
database connections will be transparently persistent.
See [Stein], pp.
236-237.)

Finally, the method returns true to tell C<ClassParser>
to proceed with parsing the C<SELECT> element's child elements:

        return 1;
    }

The C<next_flavor()> method begins identically to C<query_flavors()>:

    sub next_flavor {
        my( $this, $node, $class, $is_end_tag ) = @_;

The fetch of the next tuple from the query
must be performed upon encountering the start tag of the C<OPTION> element;
therefore, the method returns true immediately if $is_end_tag is true.
This tells C<ClassParser> to loop back to the beginning of the element,
in this case to do another fetch:

        return 1 if $is_end_tag;

The next portion of code fetches a tuple from the database.
If there are no more tuples,
the method returns false.
This tells C<ClassParser> not to emit the HTML for the C<OPTION> element
and also tells it to stop looping:

        my( $flavor_id, $flavor_name ) = $this->{ sth }->fetchrow();
        unless ( $flavor_id ) {
            $this->{ sth }->finish();
            $this->{ dbh }->disconnect();
            return 0;
        }

The code also disconnects from the database.
(However, if C<Apache::DBI> was specified,
the C<disconnect()> becomes a no-op
and the connection remains persistent.)

The next portion of code substitutes content in the HTML that will be emitted.
The first line sets the value of the C<OPTION> element's C<VALUE> attribute
to be the C<flavor_id> from the tuple:

        $node->att( 'value', $flavor_id );

and the next line replaces the text of the first child node
(in this case, the text "Tooty Fruity")
with the C<flavor_name> from the tuple:

        $node->children()->[0]->text( $flavor_name );

Finally, the method returns true to tell C<ClassParser>
to emit the HTML for the C<OPTION> element
now containing the dynamically generated content:

        return 1;
    }

=head1 Other Stuff

=head2 Built-in CLASSes

There are several HTML manipulations that are performed routinely;
therefore, CLASSes to perform these manipulations are built-in
without needing to be explicitly
listed in a C<class_map>.
All of the built-in CLASSes always return false when called for an end tag;
all but C<if> and C<unless> always return true for a start tag.

=over 4

=item C<append::>I<attribute>C<::>I<key>

Append to the value of an attribute of the current element.
The I<attribute> is the name of the HTML attribute
whose value is to be appended to
and I<key> is the key into $this that contains the value to be appended.
This example appends the value of C<$this-E<gt>S<{ flavor_id }>> to the C<HREF>
attribute:

    <A HREF="flavor_detail.chtml?flavor_id="
     CLASS="append::href::flavor_id">Flavor details</A>

B<THIS BUILT-IN CLASS IS DEPRECATED IN FAVOR OF >C<sub_param>.
