📄 xpath.class.inc
字号:
<?php
/**
* Php.XPath
*
* +======================================================================================================+
* | A php class for searching an XML document using XPath, and making modifications using a DOM
* | style API. Does not require the DOM XML PHP library.
* |
* +======================================================================================================+
* | What Is XPath:
* | --------------
* | - "What SQL is for a relational database, XPath is for an XML document." -- Sam Blum
* | - "The primary purpose of XPath is to address parts of an XML document. In support of this
* | primary purpose, it also provides basic facilities for manipulting it." -- W3C
* |
* | XPath in action and a very nice intro is under:
* | http://www.zvon.org/xxl/XPathTutorial/General/examples.html
* | Specs Can be found under:
* | http://www.w3.org/TR/xpath W3C XPath Recommendation
* | http://www.w3.org/TR/xpath20 W3C XPath Recommendation
* |
* | NOTE: Most of the XPath-spec has been realized, but not all. Usually this should not be
* | problem as the missing part is either rarely used or it's simpler to do with PHP itself.
* +------------------------------------------------------------------------------------------------------+
* | Requires PHP version 4.0.5 and up
* +------------------------------------------------------------------------------------------------------+
* | Main Active Authors:
* | --------------------
* | Nigel Swinson <nigelswinson@users.sourceforge.net>
* | Started around 2001-07, saved phpxml from near death and renamed to Php.XPath
* | Restructured XPath code to stay in line with XPath spec.
* | Sam Blum <bs_php@infeer.com>
* | Started around 2001-09 1st major restruct (V2.0) and testbench initiator.
* | 2nd (V3.0) major rewrite in 2002-02
* | Daniel Allen <bigredlinux@yahoo.com>
* | Started around 2001-10 working to make Php.XPath adhere to specs
* | Main Former Author: Michael P. Mehl <mpm@phpxml.org>
* | Inital creator of V 1.0. Stoped activities around 2001-03
* +------------------------------------------------------------------------------------------------------+
* | Code Structure:
* | --------------_
* | The class is split into 3 main objects. To keep usability easy all 3
* | objects are in this file (but may be split in 3 file in future).
* | +-------------+
* | | XPathBase | XPathBase holds general and debugging functions.
* | +------+------+
* | v
* | +-------------+ XPathEngine is the implementation of the W3C XPath spec. It contains the
* | | XPathEngine | XML-import (parser), -export and can handle xPathQueries. It's a fully
* | +------+------+ functional class but has no functions to modify the XML-document (see following).
* | v
* | +-------------+
* | | XPath | XPath extends the functionality with actions to modify the XML-document.
* | +-------------+ We tryed to implement a DOM - like interface.
* +------------------------------------------------------------------------------------------------------+
* | Usage:
* | ------
* | Scroll to the end of this php file and you will find a short sample code to get you started
* +------------------------------------------------------------------------------------------------------+
* | Glossary:
* | ---------
* | To understand how to use the functions and to pass the right parameters, read following:
* |
* | Document: (full node tree, XML-tree)
* | After a XML-source has been imported and parsed, it's stored as a tree of nodes sometimes
* | refered to as 'document'.
* |
* | AbsoluteXPath: (xPath, xPathSet)
* | A absolute XPath is a string. It 'points' to *one* node in the XML-document. We use the
* | term 'absolute' to emphasise that it is not an xPath-query (see xPathQuery). A valid xPath
* | has the form like '/AAA[1]/BBB[2]/CCC[1]'. Usually functions that require a node (see Node)
* | will also accept an abs. XPath.
* |
* | Node: (node, nodeSet, node-tree)
* | Some funtions require or return a node (or a whole node-tree). Nodes are only used with the
* | XPath-interface and have an internal structure. Every node in a XML document has a unique
* | corresponding abs. xPath. That's why public functions that accept a node, will usually also
* | accept a abs. xPath (a string) 'pointing' to an existing node (see absolutXPath).
* |
* | XPathQuery: (xquery, query)
* | A xPath-query is a string that is matched against the XML-document. The result of the match
* | is a xPathSet (vector of xPath's). It's always possible to pass a single absoluteXPath
* | instead of a xPath-query. A valid xPathQuery could look like this:
* | '//XXX/*[contains(., "foo")]/..' (See the link in 'What Is XPath' to learn more).
* |
* |
* +------------------------------------------------------------------------------------------------------+
* | Internals:
* | ----------
* | - The Node Tree
* | -------------
* | A central role of the package is how the XML-data is stored. The whole data is in a node-tree.
* | A node can be seen as the equvalent to a tag in the XML soure with some extra info.
* | For instance the following XML
* | <AAA foo="x">***<BBB/><CCC/>**<BBB/>*</AAA>
* | Would produce folowing node-tree:
* | 'super-root' <-- $nodeRoot (Very handy)
* | |
* | 'depth' 0 AAA[1] <-- top node. The 'textParts' of this node would be
* | / | \ 'textParts' => array('***','','**','*')
* | 'depth' 1 BBB[1] CCC[1] BBB[2] (NOTE: Is always size of child nodes+1)
* | - The Node
* | --------
* | The node itself is an structure desiged mainly to be used in connection with the interface of PHP.XPath.
* | That means it's possible for functions to return a sub-node-tree that can be used as input of an other
* | PHP.XPath function.
* |
* | The main structure of a node is:
* | $node = array(
* | 'name' => '', # The tag name. E.g. In <FOO bar="aaa"/> it would be 'FOO'
* | 'attributes' => array(), # The attributes of the tag E.g. In <FOO bar="aaa"/> it would be array('bar'=>'aaa')
* | 'textParts' => array(), # Array of text parts surrounding the children E.g. <FOO>aa<A>bb<B/>cc</A>dd</FOO> -> array('aa','bb','cc','dd')
* | 'childNodes' => array(), # Array of refences (pointers) to child nodes.
* |
* | For optimisation reasions some additional data is stored in the node too:
* | 'parentNode' => NULL # Reference (pointer) to the parent node (or NULL if it's 'super root')
* | 'depth' => 0, # The tag depth (or tree level) starting with the root tag at 0.
* | 'pos' => 0, # Is the zero-based position this node has in the parent's 'childNodes'-list.
* | 'contextPos' => 1, # Is the one-based position this node has by counting the siblings tags (tags with same name)
* | 'xpath' => '' # Is the abs. XPath to this node.
* | 'generated_id'=> '' # The id returned for this node by generate-id() (attribute and text nodes not supported)
* |
* | - The NodeIndex
* | -------------
* | Every node in the tree has an absolute XPath. E.g '/AAA[1]/BBB[2]' the $nodeIndex is a hash array
* | to all the nodes in the node-tree. The key used is the absolute XPath (a string).
* |
* +------------------------------------------------------------------------------------------------------+
* | License:
* | --------
* | The contents of this file are subject to the Mozilla Public License Version 1.1 (the "License");
* | you may not use this file except in compliance with the License. You may obtain a copy of the
* | License at http://www.mozilla.org/MPL/
* |
* | Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY
* | OF ANY KIND, either express or implied. See the License for the specific language governing
* | rights and limitations under the License.
* |
* | The Original Code is <phpXML/>.
* |
* | The Initial Developer of the Original Code is Michael P. Mehl. Portions created by Michael
* | P. Mehl are Copyright (C) 2001 Michael P. Mehl. All Rights Reserved.
* |
* | Contributor(s): N.Swinson / S.Blum / D.Allen
* |
* | Alternatively, the contents of this file may be used under the terms of either of the GNU
* | General Public License Version 2 or later (the "GPL"), or the GNU Lesser General Public
* | License Version 2.1 or later (the "LGPL"), in which case the provisions of the GPL or the
* | LGPL License are applicable instead of those above. If you wish to allow use of your version
* | of this file only under the terms of the GPL or the LGPL License and not to allow others to
* | use your version of this file under the MPL, indicate your decision by deleting the
* | provisions above and replace them with the notice and other provisions required by the
* | GPL or the LGPL License. If you do not delete the provisions above, a recipient may use
* | your version of this file under either the MPL, the GPL or the LGPL License.
* |
* +======================================================================================================+
*
* @author S.Blum / N.Swinson / D.Allen / (P.Mehl)
* @link http://sourceforge.net/projects/phpxpath/
* @version 3.5
* @CVS $Id: xpath.class.inc,v 1.3 2005/05/18 09:30:25 mschering Exp $
*/
// Include guard, protects file being included twice
$ConstantName = 'INCLUDED_'.strtoupper(__FILE__);
if (defined($ConstantName)) return;
define($ConstantName,1, TRUE);
/************************************************************************************************
* ===============================================================================================
* X P a t h B a s e - Class
* ===============================================================================================
************************************************************************************************/
class XPathBase {
var $_lastError;
// As debugging of the xml parse is spread across several functions, we need to make this a member.
var $bDebugXmlParse = FALSE;
// Used to help navigate through the begin/end debug calls
var $iDebugNextLinkNumber = 1;
var $aDebugOpenLinks = array();
var $aDebugFunctions = array(
//'_evaluateStep',
//'_evaluatePrimaryExpr',
//'_evaluateExpr',
//'_evaluateStep',
//'_checkPredicates',
//'_evaluateFunction',
//'_evaluateOperator',
//'_evaluatePathExpr',
);
/**
* Constructor
*/
function XPathBase() {
# $this->bDebugXmlParse = TRUE;
$this->properties['verboseLevel'] = 1; // 0=silent, 1 and above produce verbose output (an echo to screen).
if (!isSet($_ENV)) { // Note: $_ENV introduced in 4.1.0. In earlier versions, use $HTTP_ENV_VARS.
$_ENV = $GLOBALS['HTTP_ENV_VARS'];
}
// Windows 95/98 do not support file locking. Detecting OS (Operation System) and setting the
// properties['OS_supports_flock'] to FALSE if win 95/98 is detected.
// This will surpress the file locking error reported from win 98 users when exportToFile() is called.
// May have to add more OS's to the list in future (Macs?).
// ### Note that it's only the FAT and NFS file systems that are really a problem. NTFS and
// the latest php libs do support flock()
$_ENV['OS'] = isSet($_ENV['OS']) ? $_ENV['OS'] : 'Unknown OS';
switch ($_ENV['OS']) {
case 'Windows_95':
case 'Windows_98':
case 'Unknown OS':
// should catch Mac OS X compatible environment
if (!empty($_SERVER['SERVER_SOFTWARE'])
&& preg_match('/Darwin/',$_SERVER['SERVER_SOFTWARE'])) {
// fall-through
} else {
$this->properties['OS_supports_flock'] = FALSE;
break;
}
default:
$this->properties['OS_supports_flock'] = TRUE;
}
}
/**
* Resets the object so it's able to take a new xml sting/file
*
* Constructing objects is slow. If you can, reuse ones that you have used already
* by using this reset() function.
*/
function reset() {
$this->_lastError = '';
}
//-----------------------------------------------------------------------------------------
// XPathBase ------ Helpers ------
//-----------------------------------------------------------------------------------------
/**
* This method checks the right amount and match of brackets
*
* @param $term (string) String in which is checked.
* @return (bool) TRUE: OK / FALSE: KO
*/
function _bracketsCheck($term) {
$leng = strlen($term);
$brackets = 0;
$bracketMisscount = $bracketMissmatsh = FALSE;
$stack = array();
for ($i=0; $i<$leng; $i++) {
switch ($term[$i]) {
case '(' :
case '[' :
$stack[$brackets] = $term[$i];
$brackets++;
break;
case ')':
$brackets--;
if ($brackets<0) {
$bracketMisscount = TRUE;
break 2;
}
if ($stack[$brackets] != '(') {
$bracketMissmatsh = TRUE;
break 2;
}
break;
case ']' :
$brackets--;
if ($brackets<0) {
$bracketMisscount = TRUE;
break 2;
}
if ($stack[$brackets] != '[') {
$bracketMissmatsh = TRUE;
break 2;
}
break;
}
}
// Check whether we had a valid number of brackets.
if ($brackets != 0) $bracketMisscount = TRUE;
if ($bracketMisscount || $bracketMissmatsh) {
return FALSE;
}
return TRUE;
}
/**
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -