📄 lite.pm
字号:
############################################################## Module: XML::Lite## Created: 25 August 2001 by Jeremy Wadsack for Wadsack-Allen Digital Group# Copyright (C) 2001 Wadsack-Allen. All rights reserved.## TODO# * Need to support <!...> for doctypes, and doctype delarations# * Could add a method 'element' that accepts path-like syntax# * Could add write_to_file, to_string, etc. methods (requires that the orig doc be preserved!)# * Could improve support for comments, CDATA, PI's etc as objects?# * Expose handler interface# * Expose a method to provide better error handling############################################################## Date Modification Author# ----------------------------------------------------------# 04.Sep.2001 Fixed lots of bugs and built tests JW# 08.Sep.2001 Added linked list & handlers to parser JW# 04.Nov.2001 Fixed bug in parameter handling JW############################################################package XML::Lite;use strict;#$^W=1; # 'use warnings;' in perl 5.005_62 and later=head1 NAMEXML::Lite - A lightweight XML parser for simple files=head1 SYNOPSISuse XML::Lite;my $xml = new XML::Lite( xml => 'a_file.xml' );=head1 DESCRIPTIONXML::Lite is a lightweight XML parser, with basic element traversing methods. It is entirely self-contained, pure Perl (i.e. I<not> based on expat). It provides useful methods for reading most XML files, including traversing and finding elements, reading attributes and such. It is designed to take advantage of Perl-isms (Attribute lists are returned as hashes, rather than, say, lists of objects). It provides only methods for reading a file, currently.=head1 METHODSThe following methods are available:=over 4=cut use XML::Lite::Element;BEGIN { use vars qw( $VERSION @ISA ); $VERSION = '0.11'; @ISA = qw();} # end BEGIN# non-exported package globals go hereuse vars qw( %ERRORS );# Predefined error messages in English%ERRORS = ( NO_START => "A closing tag (\%1) was found with no corresponding start tag at position \%0 in your XML file.\n", NO_ROOT => "Your XML document must begin with a root element.\n", ROOT_NOT_CLOSED => "The root element of your XML document (starting at position \%0) is incomplete.\n", ELM_NOT_CLOSED => "The XML-like element starting at position \%0 is incomplete. (Did you forget to escape a '<'?)\n",);############################## The object constructor ##############################=item my $xml = new XML::Lite( xml => $source[, ...] );Creates a new XML::Lite object. The XML::Lite object acts as the documentobject for the $source that is sent to it to parse. This means that you create a new object for each document (or document sub-section). As the objects are lightweight this should not be a performance consideration.The object constructor can take several named parameters. Parameter namesmay begin with a '-' (as in the example above) but are not required to. The following parameters are recognized. xml The source XML to parse. This can be a filename, a scalar that contains the document (or document fragment), or an IO handle. As a convenince, if only on parameter is given, it is assumed to be the source.So you can use this, if you wish: my $xml = new XML::Lite( 'file.xml' );=cutsub new { my $self = {}; my $proto = shift; my %parms; my $class = ref($proto) || $proto; # Parse parameters $self->{settings} = {}; if( @_ > 1 ) { my($k, $v); local $_; %parms = @_; while( ($k, $v) = each %parms ) { $k =~ s/^-//; # Removed leading '-' if it exists. (Why do Perl programmers use this?) $self->{settings}{$k} = $v; } # end while } else { $self->{settings}{xml} = $_[0]; } # end if; bless ($self, $class); # Some defaults $self->{doc_offset} = 0; $self->{doc} = ''; $self->{_CDATA} = []; $self->{handlers} = {}; # Refer to global error messages $self->{ERRORS} = $self->{settings}{error_messages} || \%ERRORS; # Now parse the XML document and build look-up tables return undef unless $self->_parse_it(); return $self;} # end new############################ #### Public Methods #### ############################=item my $elm = $xml->root_element()Returns a reference to an XML::Lite::Element object that representsthe root element of the document.Returns C<undef> on errors.=cut# ----------------------------------------------------------# Date Modification Author# ----------------------------------------------------------# 04Sep2001 Added root alias JW# 08Sep2001 Modified to use tree instead of element list JW# 05Nov2001 Added additional aliases JW# ----------------------------------------------------------sub root;*root = \&root_element;sub get_root;*get_root = \&root_element;sub get_root_element;*get_root_element = \&root_element;sub root_element { my $self = shift; return undef unless defined $self->{doc}; # Find the first thing in the root of tree that's an element my $root; foreach( @{$self->{tree}} ) { if( @$_ == 4 ) { $root = $_; last; } # end if } # end foreach return undef unless defined $root; return XML::Lite::Element->new( $self, $root );} # end root_element=item @list = $xml->elements_by_name( $name )Returns a list of all elements that match C<$name>.C<@list> is a list of L<XML::Lite::Element> objectsIf called in a scalar context, this will return thefirst element found that matches (it's more efficient to call in a scalar context than assign the results to a list of one scalar).If no matching elements are found then returns C<undef>in scalar context or an empty list in array context.=cut# ----------------------------------------------------------# Date Modification Author# ----------------------------------------------------------# 27Aug2001 Added method. JW# 04Sep2001 Added element_by_name alias JW# ----------------------------------------------------------sub element_by_name;*element_by_name = \&elements_by_name;sub elements_by_name { my $self = shift; my( $name ) = @_; if( wantarray ) { my @list = (); foreach( @{$self->{elements}{$name}} ) { my $elm = new XML::Lite::Element( $self, $_, ); push @list, $elm if defined $elm; } # end foreach return @list; } else { return new XML::Lite::Element( $self, $self->{elements}{$name}[0] ); } # end if} # end elements_by_name############################ #### Private Methods #### ############################# ----------------------------------------------------------# Sub: _parse_it## Args: (None)## Returns: True value on success, false on failure## Description: Parses the XML file in $self->{settings}{xml}# If this is an IO reference or filename, then reads from that,# else if it starts with '<' assumes it's an XML document.# During parsing, stores and internal database of named elements# for lookups ($self->{elements}) and an internal linked list# of elements and text nodes ($self->{tree}) for traversal.# ----------------------------------------------------------# Date Modification Author# ----------------------------------------------------------# 08Sep2001 Added linked list tree to internal objects JW# ----------------------------------------------------------sub _parse_it { my $self = shift; # Get the xml content if( $self->{settings}{xml} =~ /^\s*</ ) { $self->{doc} = $self->{settings}{xml}; } else { $self->{doc} = $self->_get_a_file( $self->{settings}{xml} ); } # end if return 0 unless defined $self->{doc}; delete $self->{settings}{xml}; # Just save some memory # -- Normalize the document to make things easier to find # Remove comments (but replace with spaces to maintain positioning for messages $self->{doc} =~ s/(<!--.+?-->)/' ' x length($1)/sge; # Move CDATA to hash and insert a reference to it (so it doesn't mess up regexp parsing) $self->{doc} =~ s/<!\[CDATA\[(.+?)\]\]>/'<![CDATA['.$self->_store_cdata($1).']]\/>'/sge; # Remove processing instructions (but replace with spaces to maintain positioning for messages # (Perhaps we could do something with these -- they are instructions for processors...) $self->{doc} =~ s/(<\?.+?\?>)/' ' x length($1)/sge; # NOTE: This makes it not possible to save the same formatting # -- will also remove the space from the <?xml ...?> processing instruction! if( $self->{doc} =~ s/^(\s+)// ) { $self->{doc_offset} = length $1; # Store the number of removed chars for messages } # end if $self->{doc} =~ s/\s+$//; # Build lookup tables $self->{elements} = {}; $self->{tree} = []; # - These are used in the building process
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -