html2text.java

From "Eclipse plugins" · Java code · 35 lines

package samples.services.html2text;

public class HTML2Text {
    /**
     * Simple algorithm, not intended for any real conversion.
     * Stripping out anything between '<' and '>' (inclusive of those
     * characters) works for many cases and is simple, so that is what
     * this method does.
     */
    public static String convertHTML2Text(String html) {
        StringBuilder cleanString = new StringBuilder();
        boolean tagsExist = true;
        int endOfLastTag = 0;
        while (tagsExist) {
            int startOfNextTag = html.indexOf('<', endOfLastTag);
            int endOfNextTag = html.indexOf('>', startOfNextTag);
            if (startOfNextTag == -1 || endOfNextTag == -1) {
                // no more tags
                tagsExist = false;
            } else {
                // We have a tag: copy the text from the end of the last tag
                // to the start of this tag, that is to say, strip out the
                // stuff within the tag itself.
                cleanString.append(html.substring(endOfLastTag, startOfNextTag));
                endOfLastTag = endOfNextTag + 1;
            }
        }
        // Add the text from the end of the last tag until the end of the string.
        cleanString.append(html.substring(endOfLastTag));
        return cleanString.toString();
    }
}
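
The comments in `convertHTML2Text` say the approach "works for many cases". The sketch below is one way to probe where it does and does not hold. It re-implements the same scan so it can run on its own. The names `Html2TextDemo` and `stripTags` are hypothetical and do not appear in the original source.

```java
// Hypothetical standalone demo; not part of the original source.
class Html2TextDemo {
    // Same idea as convertHTML2Text: drop everything from '<' through the next '>'.
    static String stripTags(String html) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int open = html.indexOf('<', pos);
            if (open == -1) break;          // no further '<'
            int close = html.indexOf('>', open);
            if (close == -1) break;         // unterminated '<': treat the rest as text
            out.append(html, pos, open);    // keep the text before the tag
            pos = close + 1;                // resume after the tag
        }
        out.append(html, pos, html.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Well-formed markup: tags are removed, text is kept.
        System.out.println(stripTags("<p>Hello, <b>world</b>!</p>")); // prints "Hello, world!"
        // Entities are not decoded; the input has no '<', so it is returned unchanged.
        System.out.println(stripTags("a &lt; b"));                    // prints "a &lt; b"
        // A '<' with no later '>' is left in place.
        System.out.println(stripTags("if (a < b) return;"));          // prints "if (a < b) return;"
        // A literal '<' followed later by '>' is mistaken for a tag.
        System.out.println(stripTags("1 < 2 and 3 > 2"));             // prints "1  2"
    }
}
```

The last case shows why the original comment warns against using this for real conversion: any text containing comparison operators can lose content.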
