📄 tests.cc

📁 A well-known standard C++ HTML parser
#include <string>
#include <cstdio>
#include <iostream>
#include <fstream>
#include "CharsetConverter.h"
#include "Uri.h"
#include "ParserDom.h"
#include "utils.h"

using namespace std;
using namespace htmlcxx;

#define myassert(x) \
	do {\
		if(!(x)) {\
			fprintf(stderr, "Test at %s:%d failed!\n", __FILE__, __LINE__);\
			exit(1);\
		}\
	} while(0)

bool my_tree_compare(tree<HTML::Node>::iterator begin, tree<HTML::Node>::iterator end, tree<HTML::Node>::iterator ref)
{
	tree<HTML::Node>::iterator it(begin);
	while (it != end && ref != end)
	{
		if (it.number_of_children() != ref.number_of_children())
			return false;
		if (it->text() != ref->text())
			return false;
		++it;
		++ref;
	}
	return true;
}

class HtmlTest {
	public:
	bool parse() {
		cerr << "Parsing some html... ";
		tree<HTML::Node> tr;
		string html = "<head></head><body>\n\n\n\n<center>\n<table width=\"600\">\n<tbody><tr>\n<td width=\"120\"><a href=\"/index.html\"><img src=\"/adt-SUA/images/ADT_LOGO.gif\" alt=\"adt logo\" align=\"middle\" border=\"0\"></a></td>\n<td width=\"480\"><font size=\"+2\" face=\"helvetica,arial\"><b>Australian Digital Theses Program<br></b></font></td>\n</tr>\n</tbody></table>\n</center>\n<center>\n</center>\n</body>";
		HTML::ParserDom parser;
		parser.parse(html);
		tr = parser.getTree();
		cerr << tr << endl;
		cerr << " ok" << endl;
		return true;
	}

	bool string_manip() {
		string root_link = "http://www.akwan.com.br/teste/acton.asp?q=atletico";
		string root_link2 = "http://answerbook.ime.usp.br:8888/ab2";
		string link1 = "../a.html";
		string link2 = "//b.html";
		string link3 = "servi&#231;o.html";
		string link4 = "./d/c.html";
		string link5 = "http://www.fadazan.com.br/../../../../../Download/teste/../jacobmacanhan,%203276.jpg";
		string link6 = "search?q=galo";
		string link7 = "http://casadebruxa.com.br/anuncio/../banner/vai.asp?id=21&url=http://www.clickdirect.com.br";
		string link8 = "/ab2/Help_C/ONLINEACCESS/@Ab2HelpView/idmatch(help-library-info)";
		string link9 = "/ab2/coll.67.3/@Ab2CollView?";
		string link10 = "http://www.a.com.br";
		string link11 = "'http://www.b.com.br";
		string link12 = "?q=mineiro";
		string entities = "nos somos do clube atletico mineiro &#225; &aacute; brasil &nbsp; &teste; &atilde;&auml; &aacute &acirc; &end   ";
		string comments = "hello <!-- world --> brazil";
		string multiblank = "  1 2  3\r\n   4    5  \r\n  6  \n";
		string justblank = "     \r\n         \r\n    \n";
		string nonblank = "dsadasdada";

		myassert(HTML::strip_comments(comments) == "hello  brazil");
		myassert(HTML::single_blank(multiblank) == "1 2 3 4 5 6");
		myassert(HTML::single_blank(justblank) == "");
		myassert(HTML::single_blank(nonblank) == nonblank);
		myassert(HTML::decode_entities(entities) == "nos somos do clube atletico mineiro 
