📄 htmlnode.java

📁 A small Java HTML parsing program; the file package is very small
💻 JAVA
📖 Page 1 of 2
/*
 * HTML Parser
 * Copyright (C) 1997 David McNicol
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * file COPYING for more details.
 */

package cvu.html;

import java.util.Hashtable;
import java.util.Vector;
import java.util.Enumeration;
import java.io.DataOutputStream;
import java.io.IOException;

/**
 * This class represents a single node within an HTML tree. Each node
 * has a name, zero or more attributes and possibly some content. Nodes
 * can appear within the content of other nodes. <p>
 * End tags do not appear since they only indicate 'end-of-content'. To
 * prevent the system searching for the end of standalone tags, a dynamic
 * list has been implemented. When the HTMLNode class is resolved
 * a setup method is called adding a set of default standalone tags
 * to the list. Standalone tags can then be added and removed dynamically
 * using static method calls. <p>
 * The list is the only way the internal code can tell
 * whether a tag is standalone. If a problem occurs the tree structure
 * would still be sound, but it would not be accurate, so while the form
 * of the HTML would be conserved, searches would not operate correctly.
 * @see HTMLTree
 * @author <a href="http://www.strath.ac.uk/~ras97108/">David McNicol</a>
 */
public class HTMLNode {

	private HTMLNode parent;    // Refers to this node's parent.
	private String name;        // Stores the name of the HTML node.
	private AttributeList attr; // List of element's attributes.
	private Vector children;    // Stores the HTML node's children.
	private boolean hidden;     // True if the node is not to be printed.

	/**
	 * Constructs a new HTMLNode.
	 * @param tag the TagToken representing the start of this node.
	 * @param parent the parent of this node.
	 * @param src enumeration of tag tokens.
	 */
	public HTMLNode (TagToken tag, HTMLNode parent, Enumeration src) {

		// Store the reference to the node's parent.
		this.parent = parent;

		// Set the node to be unhidden by default.
		hidden = false;

		// Check if the given tag is null.
		if (tag != null) {

			// Store the node's name.
			name = tag.getName();

			// Store the node's attribute list.
			attr = tag.getAttributes();

			// Get the node's children if needed.
			if (HTMLNode.isStandalone(name))
				children = null;
			else
				children = parseChildren(src);
		} else {

			// Otherwise, set the name and attributes to null.
			name = null;
			attr = null;

			// Get the node's children from the enumeration.
			children = parseChildren(src);
		}
	}

	/**
	 * Constructs a new, detached HTMLNode with the specified name.
	 * @param name the name of the new node.
	 */
	public HTMLNode (String name) {

		// Store the name of the node.
		this.name = name;

		// The node will have no parent till it is added to a tree.
		parent = null;

		// Create a new attribute list.
		attr = new AttributeList();

		// Create space for children if the node is not standalone.
		if (HTMLNode.isStandalone(name))
			children = null;
		else
			children = new Vector();
	}

	/**
	 * Returns the name of this node.
	 */
	public String getName () {
		return name;
	}

	/**
	 * Returns the node's parent node.
	 */
	public HTMLNode getParent () {
		return parent;
	}

	/**
	 * Returns the node's children, or null if the node is standalone.
	 */
	public Enumeration getChildren () {

		// Return nothing if the node cannot have any children.
		if (children == null) return null;

		return children.elements();
	}

	/**
	 * Returns true if the node is currently hidden.
	 */
	public boolean isHidden () {
		return hidden;
	}

	/**
	 * Hides the node.
	 */
	public void hide () {
		hidden = true;
	}

	/**
	 * "Unhides" the node.
	 */
	public void unhide () {
		hidden = false;
	}

	/**
	 * Returns the value of the attribute with the given name.
	 * @param name the name of the attribute.
	 */
	public String getAttribute (String name) {

		// Check that the attribute list is there.
		if (attr == null) return null;

		// Return the value associated with the attribute name.
		return (String) attr.get(name);
	}

	/**
	 * Returns an enumeration of attributes defined in this node.
	 */
	public Enumeration getAttributes () {

		// Check that the attribute list has been defined.
		if (attr == null) return null;

		// Return an enumeration of all of the attribute names.
		return attr.names();
	}

	/**
	 * Returns an attribute with all double quote characters
	 * escaped with a backslash.
	 * @param name the name of the attribute.
	 */
	public String getQuotedAttribute (String name) {

		// Check that the attribute list is there.
		if (attr == null) return null;

		// Return the quoted version.
		return attr.getQuoted(name);
	}

	/**
	 * Returns a string version of the attribute and its value.
	 * @param name the name of the attribute.
	 */
	public String getAttributeToString (String name) {

		// Check that the attribute list is there.
		if (attr == null) return null;

		// Return the string version.
		return attr.toString(name);
	}

	/**
	 * Returns a string version of the HTMLNode. If the node is
	 * currently hidden then return an empty string.
	 */
	public String toString () {

		StringBuffer sb;  // Stores the string to be returned.
		Enumeration list; // List of node's attributes or children.

		// Get a new StringBuffer.
		sb = new StringBuffer();

		if (! hidden) {

			// Write the opening of the tag.
			sb.append('<');

			// Write the tag's name.
			sb.append(name);

			// Check if there are any attributes.
			if (attr != null && attr.size() > 0) {

				// Print string version of the attributes.
				sb.append(" " + attr);
			}

			// Finish off the tag.
			sb.append('>');
		}

		// Return if the node is standalone.
		if (isStandalone(name)) return sb.toString();

		// Otherwise, check if the node has any children.
		if (children != null && children.size() > 0) {

			// Get a list of all of the children.
			list = children.elements();

			while (list.hasMoreElements()) {

				// Get the next node from the list.
				Object o = list.nextElement();

				// Write it.
				sb.append(o.toString());
			}
		}

		if (! hidden) {

			// Write the end tag.
			sb.append("</").append(name).append(">");
		}

		// Return the string version.
		return sb.toString();
	}

	/**
	 * Sets the node's parent to the specified HTMLNode.
	 * @param parent the new parent.
	 */
	public void setParent (HTMLNode parent) {
		this.parent = parent;
	}

	/**
	 * Returns true if an attribute with the given name exists.
	 * @param name the name of the attribute.
	 */
	public boolean isAttribute (String name) {

		// Check that the attribute list is there.
		if (attr == null) return false;

		// Check the table for an attribute with that name.
		return attr.exists(name);
	}

	/**
	 * Adds a new attribute to the node's attribute list with
	 * the specified value. If the attribute already exists the
	 * old value is overwritten.
	 * @param name the name of the attribute.
	 * @param value the value of the attribute.
	 */
	public void addAttribute (String name, String value) {

		// Return if the attribute list is not there.
		if (attr == null) return;

		// Otherwise, add the name/value pair to the list.
		attr.set(name, value);
	}

	/**
	 * Adds an object to the end of this node's content.
	 * @param child the node to be added.
	 */
	public void addChild (Object child) {

		// Return if the child is invalid.
		if (child == null) return;

		// Return if this node cannot have children (it is standalone).
		if (children == null) return;

		// Add the child if it is a string.
		if (child instanceof String) {
			children.addElement(child);
			return;
		}

		// Add the child and set its parent if it is an HTMLNode.
		if (child instanceof HTMLNode) {
			children.addElement(child);
			((HTMLNode) child).setParent(this);
			return;
		}
	}

	/**
	 * Removes the specified HTMLNode from the current node's
	 * list of children.
	 * @param child the node to be removed.
	 */
	public void removeChild (HTMLNode child) {

		// Return if the child is not defined properly.
		if (child == null) return;

		// Return if the list of children is not defined properly.
		if (children == null) return;

		// Otherwise, remove the child if it is on the list.
		children.removeElement(child);
	}
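
The listing above breaks off at the end of page 1, and it depends on classes (TagToken, AttributeList) and a static isStandalone method that are not shown here. To make the behaviour of hide(), addChild() and toString() easier to follow, here is a minimal, self-contained sketch. It is not the author's code: NodeSketch and its fixed STANDALONE set are hypothetical stand-ins, attributes are left out, and the real class keeps its standalone list dynamically.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for HTMLNode (no attributes, no parser).
class NodeSketch {

    // Assumption: a fixed set replaces the dynamic standalone tag list.
    static final Set<String> STANDALONE =
            new HashSet<>(Arrays.asList("br", "hr", "img"));

    final String name;
    final List<Object> children;   // null for standalone tags, as in HTMLNode
    boolean hidden = false;

    NodeSketch(String name) {
        this.name = name;
        this.children = STANDALONE.contains(name) ? null : new ArrayList<>();
    }

    // Mirrors addChild: ignores null, standalone parents and unknown types.
    void addChild(Object child) {
        if (child == null || children == null) return;
        if (child instanceof String || child instanceof NodeSketch) {
            children.add(child);
        }
    }

    // Mirrors toString: a hidden node drops its own tags but keeps its content.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        if (!hidden) sb.append('<').append(name).append('>');
        if (children == null) return sb.toString();
        for (Object o : children) sb.append(o);
        if (!hidden) sb.append("</").append(name).append('>');
        return sb.toString();
    }

    public static void main(String[] args) {
        NodeSketch p = new NodeSketch("p");
        NodeSketch b = new NodeSketch("b");
        b.addChild("bold");
        p.addChild("some ");
        p.addChild(b);
        p.addChild(new NodeSketch("br"));

        System.out.println(p);   // <p>some <b>bold</b><br></p>
        b.hidden = true;
        System.out.println(p);   // <p>some bold<br></p>
    }
}
```

Note that in this sketch, as in the toString() method above, a hidden node with children still prints the children, so only a hidden node without content yields an empty string.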
