renderer.java (part of an HTML parser Java library), page 1 of 3
	 * @return this <code>Renderer</code> instance, allowing multiple property setting methods to be chained in a single statement. 
	 * @see #getConvertNonBreakingSpaces()
	 */
	public Renderer setConvertNonBreakingSpaces(final boolean convertNonBreakingSpaces) {
		this.convertNonBreakingSpaces=convertNonBreakingSpaces;
		return this;
	}

	/**
	 * Indicates whether non-breaking space ({@link CharacterEntityReference#_nbsp &amp;nbsp;}) character entity references are converted to spaces.
	 * <p>
	 * See the {@link #setConvertNonBreakingSpaces(boolean)} method for a full description of this property.
	 * 
	 * @return <code>true</code> if non-breaking space ({@link CharacterEntityReference#_nbsp &amp;nbsp;}) character entity references are converted to spaces, otherwise <code>false</code>.
	 */
	public boolean getConvertNonBreakingSpaces() {
		return convertNonBreakingSpaces;
	}

	/**
	 * Sets the size of the indent to be used for anything other than {@link HTMLElementName#LI LI} elements.
	 * <p>
	 * At present this applies to {@link HTMLElementName#BLOCKQUOTE BLOCKQUOTE} and {@link HTMLElementName#DD DD} elements.
	 * <p>
	 * The default value is <code>4</code>.
	 * 
	 * @param blockIndentSize  the size of the indent.
	 * @return this <code>Renderer</code> instance, allowing multiple property setting methods to be chained in a single statement. 
	 * @see #getBlockIndentSize()
	 */
	public Renderer setBlockIndentSize(final int blockIndentSize) {
		this.blockIndentSize=blockIndentSize;
		return this;
	}

	/**
	 * Returns the size of the indent to be used for anything other than {@link HTMLElementName#LI LI} elements.
	 * <p>
	 * See the {@link #setBlockIndentSize(int)} method for a full description of this property.
	 *
	 * @return the size of the indent to be used for anything other than {@link HTMLElementName#LI LI} elements.
	 */
	public int getBlockIndentSize() {
		return blockIndentSize;
	}

	/**
	 * Sets the size of the indent to be used for {@link HTMLElementName#LI LI} elements.
	 * <p>
	 * The default value is <code>6</code>.
	 * <p>
	 * This applies to {@link HTMLElementName#LI LI} elements inside both {@link HTMLElementName#UL UL} and {@link HTMLElementName#OL OL} elements.
	 * <p>
	 * The bullet or number of the list item is included as part of the indent.
	 * 
	 * @param listIndentSize  the size of the indent.
	 * @return this <code>Renderer</code> instance, allowing multiple property setting methods to be chained in a single statement. 
	 * @see #getListIndentSize()
	 */
	public Renderer setListIndentSize(final int listIndentSize) {
		this.listIndentSize=listIndentSize;
		return this;
	}

	/**
	 * Returns the size of the indent to be used for {@link HTMLElementName#LI LI} elements.
	 * <p>
	 * See the {@link #setListIndentSize(int)} method for a full description of this property.
	 *
	 * @return the size of the indent to be used for {@link HTMLElementName#LI LI} elements.
	 */
	public int getListIndentSize() {
		return listIndentSize;
	}

	/**
	 * Sets the bullet characters to use for list items inside {@link HTMLElementName#UL UL} elements.
	 * <p>
	 * The values in the default array are <code>*</code>, <code>o</code>, <code>+</code> and <code>#</code>.
	 * <p>
	 * If the nesting of rendered lists goes deeper than the length of this array, the bullet characters start repeating from the first in the array.
	 * <p>
	 * WARNING: If any of the characters in the default array are modified, this will affect all other instances of this class using the default array.
	 * 
	 * @param listBullets  an array of characters to be used as bullets, must have at least one entry.
	 * @return this <code>Renderer</code> instance, allowing multiple property setting methods to be chained in a single statement. 
	 * @see #getListBullets()
	 */
	public Renderer setListBullets(final char[] listBullets) {
		if (listBullets==null || listBullets.length==0) throw new IllegalArgumentException("listBullets argument must be an array of at least one character");
		this.listBullets=listBullets;
		return this;
	}

	/**
	 * Returns the bullet characters to use for list items inside {@link HTMLElementName#UL UL} elements.
	 * <p>
	 * See the {@link #setListBullets(char[])} method for a full description of this property.
	 *
	 * @return the bullet characters to use for list items inside {@link HTMLElementName#UL UL} elements.
	 */
	public char[] getListBullets() {
		return listBullets;
	}

	/**
	 * Sets the string that is to separate table cells.
	 * <p>
	 * The default value is <code>" \t"</code> (a space followed by a tab).
	 * 
	 * @param tableCellSeparator  the string that is to separate table cells.
	 * @return this <code>Renderer</code> instance, allowing multiple property setting methods to be chained in a single statement. 
	 * @see #getTableCellSeparator()
	 */
	public Renderer setTableCellSeparator(final String tableCellSeparator) {
		this.tableCellSeparator=tableCellSeparator;
		return this;
	}

	/**
	 * Returns the string that is to separate table cells.
	 * <p>
	 * See the {@link #setTableCellSeparator(String)} method for a full description of this property.
	 *
	 * @return the string that is to separate table cells.
	 */
	public String getTableCellSeparator() {
		return tableCellSeparator;
	}
	
	/** This class does the actual work, but is first passed final copies of all the parameters for efficiency. */
	private static final class Processor {
		private final Renderer renderer;
		private final Segment rootSegment;
		private final Source source;
		private final int maxLineLength;
		private final String newLine;
		private final boolean includeHyperlinkURLs;
		private final boolean decorateFontStyles;
		private final boolean convertNonBreakingSpaces;
		private final int blockIndentSize;
		private final int listIndentSize;
		private final char[] listBullets;
		private final String tableCellSeparator;
	
		private Appendable appendable;
		private int renderedIndex; // keeps track of where rendering is up to in case of overlapping elements
		private boolean atStartOfLine;
		private int col;
		private int blockIndentLevel;
		private int listIndentLevel;
		private int blockVerticalMargin; // minimum number of blank lines to output at the current block boundary, or NO_MARGIN (-1) if we are not currently at a block boundary.
		private boolean preformatted;
		private boolean lastCharWhiteSpace;
		private boolean ignoreInitialWhitespace;
		private boolean bullet;
		private int listBulletNumber;
	
		private static final int NO_MARGIN=-1;
		private static final int UNORDERED_LIST=-1;
	
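		// Maps element names to their handlers. Populated once in the static initializer below and never reassigned.
		private static final Map<String,ElementHandler> ELEMENT_HANDLERS=new HashMap<String,ElementHandler>();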
		static {
			ELEMENT_HANDLERS.put(HTMLElementName.A,A_ElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.ADDRESS,StandardBlockElementHandler.INSTANCE_0_0);
			ELEMENT_HANDLERS.put(HTMLElementName.APPLET,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.B,FontStyleElementHandler.INSTANCE_B);
			ELEMENT_HANDLERS.put(HTMLElementName.BLOCKQUOTE,StandardBlockElementHandler.INSTANCE_1_1_INDENT);
			ELEMENT_HANDLERS.put(HTMLElementName.BR,BR_ElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.BUTTON,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.CAPTION,StandardBlockElementHandler.INSTANCE_0_0);
			ELEMENT_HANDLERS.put(HTMLElementName.CENTER,StandardBlockElementHandler.INSTANCE_1_1);
			ELEMENT_HANDLERS.put(HTMLElementName.CODE,FontStyleElementHandler.INSTANCE_CODE);
			ELEMENT_HANDLERS.put(HTMLElementName.DD,StandardBlockElementHandler.INSTANCE_0_0_INDENT);
			ELEMENT_HANDLERS.put(HTMLElementName.DIR,ListElementHandler.INSTANCE_UL);
			ELEMENT_HANDLERS.put(HTMLElementName.DIV,StandardBlockElementHandler.INSTANCE_0_0);
			ELEMENT_HANDLERS.put(HTMLElementName.DT,StandardBlockElementHandler.INSTANCE_0_0);
			ELEMENT_HANDLERS.put(HTMLElementName.EM,FontStyleElementHandler.INSTANCE_I);
			ELEMENT_HANDLERS.put(HTMLElementName.FIELDSET,StandardBlockElementHandler.INSTANCE_1_1);
			ELEMENT_HANDLERS.put(HTMLElementName.FORM,StandardBlockElementHandler.INSTANCE_1_1);
			ELEMENT_HANDLERS.put(HTMLElementName.H1,StandardBlockElementHandler.INSTANCE_2_1);
			ELEMENT_HANDLERS.put(HTMLElementName.H2,StandardBlockElementHandler.INSTANCE_2_1);
			ELEMENT_HANDLERS.put(HTMLElementName.H3,StandardBlockElementHandler.INSTANCE_2_1);
			ELEMENT_HANDLERS.put(HTMLElementName.H4,StandardBlockElementHandler.INSTANCE_2_1);
			ELEMENT_HANDLERS.put(HTMLElementName.H5,StandardBlockElementHandler.INSTANCE_2_1);
			ELEMENT_HANDLERS.put(HTMLElementName.H6,StandardBlockElementHandler.INSTANCE_2_1);
			ELEMENT_HANDLERS.put(HTMLElementName.HEAD,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.HR,HR_ElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.I,FontStyleElementHandler.INSTANCE_I);
			ELEMENT_HANDLERS.put(HTMLElementName.LEGEND,StandardBlockElementHandler.INSTANCE_0_0);
			ELEMENT_HANDLERS.put(HTMLElementName.LI,LI_ElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.MENU,ListElementHandler.INSTANCE_UL);
			ELEMENT_HANDLERS.put(HTMLElementName.MAP,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.NOFRAMES,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.NOSCRIPT,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.OL,ListElementHandler.INSTANCE_OL);
			ELEMENT_HANDLERS.put(HTMLElementName.P,StandardBlockElementHandler.INSTANCE_1_1);
			ELEMENT_HANDLERS.put(HTMLElementName.PRE,PRE_ElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.SCRIPT,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.SELECT,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.STRONG,FontStyleElementHandler.INSTANCE_B);
			ELEMENT_HANDLERS.put(HTMLElementName.STYLE,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.TEXTAREA,RemoveElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.TD,TD_ElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.TH,TD_ElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.TR,TR_ElementHandler.INSTANCE);
			ELEMENT_HANDLERS.put(HTMLElementName.U,FontStyleElementHandler.INSTANCE_U);
			ELEMENT_HANDLERS.put(HTMLElementName.UL,ListElementHandler.INSTANCE_UL);
		}	
	
		public Processor(final Renderer renderer, final Segment rootSegment, final int maxLineLength, final String newLine, final boolean includeHyperlinkURLs, final boolean decorateFontStyles, final boolean convertNonBreakingSpaces, final int blockIndentSize, final int listIndentSize, final char[] listBullets, final String tableCellSeparator) {
			this.renderer=renderer;
			this.rootSegment=rootSegment;
			source=rootSegment.source;
			this.maxLineLength=maxLineLength;
			this.newLine=newLine;
			this.includeHyperlinkURLs=includeHyperlinkURLs;
			this.decorateFontStyles=decorateFontStyles;
			this.convertNonBreakingSpaces=convertNonBreakingSpaces;
			this.blockIndentSize=blockIndentSize;
			this.listIndentSize=listIndentSize;
			this.listBullets=listBullets;
			this.tableCellSeparator=tableCellSeparator;
		}
	
		public void appendTo(final Appendable appendable) throws IOException {
			reset();
			this.appendable=appendable;
			appendSegmentProcessingChildElements(rootSegment.begin,rootSegment.end,rootSegment.getChildElements());
		}
	
		private void reset() {
			renderedIndex=0;
			atStartOfLine=true;
			col=0;
			blockIndentLevel=0;
			listIndentLevel=0;
			blockVerticalMargin=NO_MARGIN;
			preformatted=false;
			lastCharWhiteSpace=ignoreInitialWhitespace=false;
			bullet=false;
		}
	
		private void appendElementContent(final Element element) throws IOException {
			final int contentEnd=element.getContentEnd();
			if (element.isEmpty() || renderedIndex>=contentEnd) return;
			final int contentBegin=element.getStartTag().end;
			appendSegmentProcessingChildElements(Math.max(renderedIndex,contentBegin),contentEnd,element.getChildElements());
		}
	
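		// Renders the source between begin and end: each child element is passed to its handler,
		// and the text between child elements is appended with its tags removed.
		// Child elements that end at or before the current index are skipped.
		private void appendSegmentProcessingChildElements(final int begin, final int end, final List<Element> childElements) throws IOException {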
			int index=begin;
			for (Element childElement : childElements) {
				if (index>=childElement.end) continue;
				if (index<childElement.begin) appendSegmentRemovingTags(index,childElement.begin);
				getElementHandler(childElement).process(this,childElement);
				index=Math.max(renderedIndex,childElement.end);
			}
			if (index<end) appendSegmentRemovingTags(index,end);
		}
	
		private static ElementHandler getElementHandler(final Element element) {
			if (element.getStartTag().getStartTagType().isServerTag()) return RemoveElementHandler.INSTANCE; // hard-coded configuration does not include server tags in child element hierarchy, so this is normally not executed.
			ElementHandler elementHandler=ELEMENT_HANDLERS.get(element.getName());
			return (elementHandler!=null) ? elementHandler : StandardInlineElementHandler.INSTANCE;
		}
	
		// Appends the text between begin and end, skipping over every tag that begins inside that range.
		private void appendSegmentRemovingTags(final int begin, final int end) throws IOException {
			int index=begin;
			while (true) {
				Tag tag=source.getNextTag(index);
				if (tag==null || tag.begin>=end) break;
				appendSegment(index,tag.begin);
				index=tag.end;
			}
			appendSegment(index,end);
		}

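		// Appends the text between begin and end, clamping begin to renderedIndex so that
		// nothing is rendered twice, and then advances renderedIndex to end.
		private void appendSegment(int begin, final int end) throws IOException {
			assert begin<=end;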
			if (begin<renderedIndex) begin=renderedIndex;
			if (begin>=end) return;
			try {
				if (preformatted)
					appendPreformattedSegment(begin,end);
				else
					appendNonPreformattedSegment(begin,end);
			} finally {
				if (renderedIndex<end) renderedIndex=end;
			}
		}
	
		private void appendPreformattedSegment(final int begin, final int end) throws IOException {
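
The tag-skipping loop in appendSegmentRemovingTags and the renderedIndex guard in appendSegment can be tried in isolation. The following is a minimal sketch, not the library's code: the class name TagStrippingSketch, the strip helper and the regular expression standing in for Source.getNextTag are all assumptions made for illustration, and the regex is far cruder than a real tag parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagStrippingSketch {
    // Hypothetical stand-in for Source.getNextTag: anything shaped like <...> counts as a tag.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    private final String source;
    private final StringBuilder out = new StringBuilder();
    private int renderedIndex = 0; // end of the source text that has already been emitted

    public TagStrippingSketch(String source) {
        this.source = source;
    }

    // Emits source[begin,end), clamping begin so that nothing is emitted twice.
    private void appendSegment(int begin, int end) {
        if (begin < renderedIndex) begin = renderedIndex;
        if (begin >= end) return;
        out.append(source, begin, end);
        renderedIndex = end;
    }

    // Emits the text in [begin,end), skipping every tag that begins inside the range.
    public void appendSegmentRemovingTags(int begin, int end) {
        int index = begin;
        Matcher m = TAG.matcher(source);
        while (m.find(index) && m.start() < end) {
            appendSegment(index, m.start()); // text before the tag
            index = m.end();                 // resume after the tag
        }
        appendSegment(index, end);           // trailing text after the last tag
    }

    public String result() {
        return out.toString();
    }

    // Convenience wrapper (hypothetical): strips the tags from a whole string.
    public static String strip(String html) {
        TagStrippingSketch s = new TagStrippingSketch(html);
        s.appendSegmentRemovingTags(0, html.length());
        return s.result();
    }
}
```

Calling appendSegmentRemovingTags a second time over a range that has already been rendered appends nothing, because every segment is clamped to renderedIndex first. This mirrors how the original uses renderedIndex to cope with overlapping elements.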

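
The Javadoc for setListBullets states that bullet characters repeat from the first entry once list nesting goes deeper than the array, and every setter returns the instance so calls can be chained. The sketch below illustrates both ideas with a hypothetical stand-in class. BulletCycleSketch and bulletFor are not part of the library, and the modulo indexing is an assumption about how the repetition could be implemented. The default values are the ones the Javadoc documents.

```java
public class BulletCycleSketch {
    // Defaults as documented in the Javadoc: bullets * o + # and a list indent of 6.
    private char[] listBullets = {'*', 'o', '+', '#'};
    private int listIndentSize = 6;

    // Chainable setter with the same validation as the original setListBullets.
    public BulletCycleSketch setListBullets(final char[] listBullets) {
        if (listBullets == null || listBullets.length == 0)
            throw new IllegalArgumentException("listBullets argument must be an array of at least one character");
        this.listBullets = listBullets;
        return this;
    }

    // Chainable setter mirroring setListIndentSize.
    public BulletCycleSketch setListIndentSize(final int listIndentSize) {
        this.listIndentSize = listIndentSize;
        return this;
    }

    public int getListIndentSize() {
        return listIndentSize;
    }

    // Returns the bullet for a nesting depth, where 1 is a top-level list.
    // Assumption: the cycle is plain modulo arithmetic over the array.
    public char bulletFor(final int depth) {
        return listBullets[(depth - 1) % listBullets.length];
    }
}
```

With the defaults, depths 1 to 4 give `*`, `o`, `+` and `#`, and depth 5 wraps around to `*`. A chained configuration reads `new BulletCycleSketch().setListIndentSize(4).setListBullets(new char[] {'-', '>'})`.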