<?php
/**
 * Php.XPath
 *
 * +======================================================================================================+
 * | A php class for searching an XML document using XPath, and making modifications using a DOM
 * | style API. Does not require the DOM XML PHP library.
 * |
 * +======================================================================================================+
 * | What Is XPath:
 * | --------------
 * | - "What SQL is for a relational database, XPath is for an XML document." -- Sam Blum
 * | - "The primary purpose of XPath is to address parts of an XML document. In support of this
 * |    primary purpose, it also provides basic facilities for manipulating it." -- W3C
 * |
 * | XPath in action and a very nice intro is under:
 * |    http://www.zvon.org/xxl/XPathTutorial/General/examples.html
 * | Specs can be found under:
 * |    http://www.w3.org/TR/xpath     W3C XPath Recommendation
 * |    http://www.w3.org/TR/xpath20   W3C XPath 2.0 Recommendation
 * |
 * | NOTE: Most of the XPath spec has been implemented, but not all. Usually this should not be a
 * |       problem, as the missing parts are either rarely used or simpler to do in PHP itself.
 * +------------------------------------------------------------------------------------------------------+
 * | Requires PHP version 4.0.5 and up
 * +------------------------------------------------------------------------------------------------------+
 * | Main Active Authors:
 * | --------------------
 * | Nigel Swinson <nigelswinson@users.sourceforge.net>
 * |   Started around 2001-07, saved phpxml from near death and renamed to Php.XPath
 * |   Restructured XPath code to stay in line with XPath spec.
 * | Sam Blum <bs_php@infeer.com>
 * |   Started around 2001-09 1st major restructure (V2.0) and testbench initiator.
 * |   2nd (V3.0) major rewrite in 2002-02
 * | Daniel Allen <bigredlinux@yahoo.com>
 * |   Started around 2001-10 working to make Php.XPath adhere to specs
 * | Main Former Author: Michael P. Mehl <mpm@phpxml.org>
 * |   Initial creator of V 1.0.
 * |   Stopped activities around 2001-03
 * +------------------------------------------------------------------------------------------------------+
 * | Code Structure:
 * | ---------------
 * | The class is split into 3 main objects. To keep usability easy all 3
 * | objects are in this file (but may be split into 3 files in future).
 * |
 * |   +-------------+
 * |   |  XPathBase  | XPathBase holds general and debugging functions.
 * |   +------+------+
 * |          v
 * |   +-------------+ XPathEngine is the implementation of the W3C XPath spec. It contains the
 * |   | XPathEngine | XML-import (parser), -export and can handle xPathQueries. It's a fully
 * |   +------+------+ functional class but has no functions to modify the XML-document (see following).
 * |          v
 * |   +-------------+
 * |   |    XPath    | XPath extends the functionality with actions to modify the XML-document.
 * |   +-------------+ We tried to implement a DOM-like interface.
 * +------------------------------------------------------------------------------------------------------+
 * | Usage:
 * | ------
 * | Scroll to the end of this php file and you will find a short sample code to get you started
 * +------------------------------------------------------------------------------------------------------+
 * | Glossary:
 * | ---------
 * | To understand how to use the functions and to pass the right parameters, read the following:
 * |
 * | Document: (full node tree, XML-tree)
 * |   After an XML source has been imported and parsed, it's stored as a tree of nodes sometimes
 * |   referred to as 'document'.
 * |
 * | AbsoluteXPath: (xPath, xPathSet)
 * |   An absolute XPath is a string. It 'points' to *one* node in the XML-document. We use the
 * |   term 'absolute' to emphasise that it is not an xPath-query (see xPathQuery). A valid xPath
 * |   has the form like '/AAA[1]/BBB[2]/CCC[1]'. Usually functions that require a node (see Node)
 * |   will also accept an abs. XPath.
 * |
 * | Node: (node, nodeSet, node-tree)
 * |   Some functions require or return a node (or a whole node-tree).
 * |   Nodes are only used with the XPath-interface and have an internal structure. Every node
 * |   in an XML document has a unique corresponding abs. xPath. That's why public functions that
 * |   accept a node will usually also accept an abs. xPath (a string) 'pointing' to an existing
 * |   node (see AbsoluteXPath).
 * |
 * | XPathQuery: (xquery, query)
 * |   An xPath-query is a string that is matched against the XML-document. The result of the match
 * |   is an xPathSet (vector of xPaths). It's always possible to pass a single absoluteXPath
 * |   instead of an xPath-query. A valid xPathQuery could look like this:
 * |   '//XXX/*[contains(., "foo")]/..' (See the link in 'What Is XPath' to learn more).
 * |
 * |
 * +------------------------------------------------------------------------------------------------------+
 * | Internals:
 * | ----------
 * | - The Node Tree
 * |   -------------
 * |   A central role of the package is how the XML-data is stored. The whole data is in a node-tree.
 * |   A node can be seen as the equivalent of a tag in the XML source with some extra info.
 * |   For instance the following XML
 * |     <AAA foo="x">***<BBB/><CCC/>**<BBB/>*</AAA>
 * |   would produce the following node-tree:
 * |     'super-root'      <-- $nodeRoot (Very handy)
 * |          |
 * |     'depth' 0        AAA[1]           <-- top node. The 'textParts' of this node would be
 * |                    /   |    \             'textParts' => array('***','','**','*')
 * |     'depth' 1  BBB[1] CCC[1] BBB[2]       (NOTE: Is always size of child nodes+1)
 * | - The Node
 * |   --------
 * |   The node itself is a structure designed mainly to be used in connection with the interface of PHP.XPath.
 * |   That means it's possible for functions to return a sub-node-tree that can be used as input of another
 * |   PHP.XPath function.
 * |
 * |   The main structure of a node is:
 * |     $node = array(
 * |       'name'        => '',      # The tag name. E.g. In <FOO bar="aaa"/> it would be 'FOO'
 * |       'attributes'  => array(), # The attributes of the tag. E.g.
 * |                                 # In <FOO bar="aaa"/> it would be array('bar'=>'aaa')
 * |       'textParts'   => array(), # Array of text parts surrounding the children. E.g. for the FOO node in
 * |                                 # <FOO>aa<A>bb<B/>cc</A>dd</FOO> -> array('aa','dd')
 * |       'childNodes'  => array(), # Array of references (pointers) to child nodes.
 * |
 * |   For optimisation reasons some additional data is stored in the node too:
 * |       'parentNode'  => NULL     # Reference (pointer) to the parent node (or NULL if it's 'super root')
 * |       'depth'       => 0,       # The tag depth (or tree level) starting with the root tag at 0.
 * |       'pos'         => 0,       # The zero-based position this node has in the parent's 'childNodes' list.
 * |       'contextPos'  => 1,       # The one-based position this node has among its sibling tags (tags with the same name)
 * |       'xpath'       => ''       # The abs. XPath to this node.
 * |       'generated_id'=> ''       # The id returned for this node by generate-id() (attribute and text nodes not supported)
 * |
 * | - The NodeIndex
 * |   -------------
 * |   Every node in the tree has an absolute XPath, e.g. '/AAA[1]/BBB[2]'. The $nodeIndex is a hash array
 * |   pointing to all the nodes in the node-tree. The key used is the absolute XPath (a string).
 * |
 * +------------------------------------------------------------------------------------------------------+
 * | License:
 * | --------
 * | The contents of this file are subject to the Mozilla Public License Version 1.1 (the "License");
 * | you may not use this file except in compliance with the License. You may obtain a copy of the
 * | License at http://www.mozilla.org/MPL/
 * |
 * | Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY
 * | OF ANY KIND, either express or implied. See the License for the specific language governing
 * | rights and limitations under the License.
 * |
 * | The Original Code is <phpXML/>.
 * |
 * | The Initial Developer of the Original Code is Michael P. Mehl. Portions created by Michael
 * | P. Mehl are Copyright (C) 2001 Michael P. Mehl. All Rights Reserved.
 * |
 * | Contributor(s): N.Swinson / S.Blum / D.Allen
 * |
 * | Alternatively, the contents of this file may be used under the terms of either of the GNU
 * | General Public License Version 2 or later (the "GPL"), or the GNU Lesser General Public
 * | License Version 2.1 or later (the "LGPL"), in which case the provisions of the GPL or the
 * | LGPL License are applicable instead of those above. If you wish to allow use of your version
 * | of this file only under the terms of the GPL or the LGPL License and not to allow others to
 * | use your version of this file under the MPL, indicate your decision by deleting the
 * | provisions above and replace them with the notice and other provisions required by the
 * | GPL or the LGPL License. If you do not delete the provisions above, a recipient may use
 * | your version of this file under either the MPL, the GPL or the LGPL License.
 * |
 * +======================================================================================================+
 *
 * @author  S.Blum / N.Swinson / D.Allen / (P.Mehl)
 * @link    http://sourceforge.net/projects/phpxpath/
 * @version 3.4
 * @CVS $Id: XPath.class.php,v 1.1 2005/04/15 21:23:31 mschering Exp $
 */

/************************************************************************************************
* ===============================================================================================
*                               X P a t h B a s e  -  Class
* ===============================================================================================
************************************************************************************************/
class XPathBase {
  var $_lastError;

  // As debugging of the xml parse is spread across several functions, we need to make this a member.
  var $bDebugXmlParse = FALSE;

  // Used to help navigate through the begin/end debug calls
  var $iDebugNextLinkNumber = 1;
  var $aDebugOpenLinks = array();

  /**
   * Constructor
   */
  function XPathBase() {
    # $this->bDebugXmlParse = TRUE;
    $this->properties['verboseLevel'] = 1;  // 0=silent, 1 and above produce verbose output (an echo to screen).

    if (!isSet($_ENV)) {  // Note: $_ENV introduced in 4.1.0. In earlier versions, use $HTTP_ENV_VARS.
      $_ENV = $GLOBALS['HTTP_ENV_VARS'];
    }

    // Windows 95/98 do not support file locking. Detect the OS (operating system) and set
    // properties['OS_supports_flock'] to FALSE if Win 95/98 is detected.
    // This suppresses the file locking error reported by Win 98 users when exportToFile() is called.
    // May have to add more OS's to the list in future (Macs?).
    // ### Note that it's only the FAT and NFS file systems that are really a problem. NTFS and
    // the latest php libs do support flock()
    $_ENV['OS'] = isSet($_ENV['OS']) ? $_ENV['OS'] : 'Unknown OS';
    switch ($_ENV['OS']) {
      case 'Windows_95':
      case 'Windows_98':
      case 'Unknown OS':
        // should catch Mac OS X compatible environment
        if (preg_match('/Darwin/',$_SERVER['SERVER_SOFTWARE'])) {
          // fall-through
        } else {
          $this->properties['OS_supports_flock'] = FALSE;
          break;
        }
      default:
        $this->properties['OS_supports_flock'] = TRUE;
    }
  }

  /**
   * Resets the object so it's able to take a new xml string/file
   *
   * Constructing objects is slow. If you can, reuse ones that you have used already
   * by using this reset() function.
   */
  function reset() {
    $this->_lastError = '';
  }

  //-----------------------------------------------------------------------------------------
  // XPathBase                    ------  Helpers  ------
  //-----------------------------------------------------------------------------------------

  /**
   * Checks that the brackets in a string are balanced and correctly matched
   *
   * @param  $term (string) String to check.
   * @return       (bool)   TRUE: OK / FALSE: KO
   */
  function _bracketsCheck($term) {
    $leng = strlen($term);
    $brackets = 0;
    $bracketMiscount = $bracketMismatch = FALSE;
    $stack = array();  // Opening brackets seen so far, indexed by nesting depth.
    for ($i=0; $i<$leng; $i++) {
      switch ($term[$i]) {
        case '(' :
        case '[' :
          $stack[$brackets] = $term[$i];
          $brackets++;
          break;
        case ')':
          $brackets--;
          if ($brackets<0) {
            $bracketMiscount = TRUE;
            break 2;
          }
          if ($stack[$brackets] != '(') {
            $bracketMismatch = TRUE;
            break 2;
          }
          break;
        case ']' :
          $brackets--;
          if ($brackets<0) {
            $bracketMiscount = TRUE;
            break 2;
          }
          if ($stack[$brackets] != '[') {
            $bracketMismatch = TRUE;
            break 2;
          }
          break;
      }
    }
    // Check whether we had a valid number of brackets.
    if ($brackets != 0) $bracketMiscount = TRUE;
    if ($bracketMiscount || $bracketMismatch) {
      return FALSE;
    }
    return TRUE;
  }

  /**
   * Looks for a string within another string -- BUT the search-string must be located *outside* of any brackets.
   *
   * Brackets in the string being searched are respected: the search only
   * succeeds if the string being looked for is located outside of any
   * brackets.
   *
   * @param  $term       (string) String in which the search shall take place.
   * @param  $expression (string) String to search for.
   * @return             (int)    This method returns -1 if no string was found,
   *                              otherwise the offset at which the string was found.
   */
  function _searchString($term, $expression) {
    $bracketCounter = 0; // Record where we are in the brackets.
    $leng = strlen($term);
    $exprLeng = strlen($expression);
    for ($i=0; $i<$leng; $i++) {
      $char = $term[$i];
      if ($char=='(' || $char=='[') {
        $bracketCounter++;
        continue;
      }
      elseif ($char==')' || $char==']') {
        $bracketCounter--;