⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 libxml2-htmlparser.html

📁 xml开源解析代码.版本为libxml2-2.6.29,可支持GB3212.网络消息发送XML时很有用.
💻 HTML
📖 第 1 页 / 共 4 页
字号:
    <a name="HTML_PARSE_NOERROR">HTML_PARSE_NOERROR</a> = 32 /* suppress error reports */    <a name="HTML_PARSE_NOWARNING">HTML_PARSE_NOWARNING</a> = 64 /* suppress warning reports */    <a name="HTML_PARSE_PEDANTIC">HTML_PARSE_PEDANTIC</a> = 128 /* pedantic error reporting */    <a name="HTML_PARSE_NOBLANKS">HTML_PARSE_NOBLANKS</a> = 256 /* remove blank nodes */    <a name="HTML_PARSE_NONET">HTML_PARSE_NONET</a> = 2048 /* Forbid network access */    <a name="HTML_PARSE_COMPACT">HTML_PARSE_COMPACT</a> = 65536 /*  compact small text nodes */};</pre><p/></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlSAXHandler">Typedef </a>htmlSAXHandler</h3><pre class="programlisting"><a href="libxml2-tree.html#xmlSAXHandler">xmlSAXHandler</a> htmlSAXHandler;</pre><p/></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlSAXHandlerPtr">Typedef </a>htmlSAXHandlerPtr</h3><pre class="programlisting"><a href="libxml2-tree.html#xmlSAXHandlerPtr">xmlSAXHandlerPtr</a> htmlSAXHandlerPtr;</pre><p/></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlStatus">Enum </a>htmlStatus</h3><pre class="programlisting">enum <a href="#htmlStatus">htmlStatus</a> {    <a name="HTML_NA">HTML_NA</a> = 0 /* something we don't check at all */    <a name="HTML_INVALID">HTML_INVALID</a> = 1    <a name="HTML_DEPRECATED">HTML_DEPRECATED</a> = 2    <a name="HTML_VALID">HTML_VALID</a> = 4    <a name="HTML_REQUIRED">HTML_REQUIRED</a> = 12 /*  VALID bit set so ( &amp; HTML_VALID ) is TRUE */};</pre><p/></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="UTF8ToHtml"/>UTF8ToHtml ()</h3><pre class="programlisting">int	UTF8ToHtml			(unsigned char * out, <br/>					 int * outlen, <br/>					 const unsigned char * in, <br/>					 int * inlen)<br/></pre><p>Take a block of UTF-8 chars in and try to convert it to an ASCII plus HTML entities block of chars out.</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>out</tt></i>:</span></td><td>a pointer to an array of bytes to store the result</td></tr><tr><td><span class="term"><i><tt>outlen</tt></i>:</span></td><td>the length of @out</td></tr><tr><td><span class="term"><i><tt>in</tt></i>:</span></td><td>a pointer to an array of UTF-8 chars</td></tr><tr><td><span class="term"><i><tt>inlen</tt></i>:</span></td><td>the length of @in</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>0 if success, -2 if the transcoding fails, or -1 otherwise The value of @inlen after return is the number of octets consumed as the return value is positive, else unpredictable. The value of @outlen after return is the number of octets consumed.</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlAttrAllowed"/>htmlAttrAllowed ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlStatus">htmlStatus</a>	htmlAttrAllowed		(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * elt, <br/>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * attr, <br/>					 int legacy)<br/></pre><p>Checks whether an <a href="libxml2-SAX.html#attribute">attribute</a> is valid for an element Has full knowledge of Required and Deprecated attributes</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>elt</tt></i>:</span></td><td>HTML element</td></tr><tr><td><span class="term"><i><tt>attr</tt></i>:</span></td><td>HTML <a href="libxml2-SAX.html#attribute">attribute</a></td></tr><tr><td><span class="term"><i><tt>legacy</tt></i>:</span></td><td>whether to allow deprecated attributes</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>one of HTML_REQUIRED, HTML_VALID, HTML_DEPRECATED, <a href="libxml2-HTMLparser.html#HTML_INVALID">HTML_INVALID</a></td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlAutoCloseTag"/>htmlAutoCloseTag ()</h3><pre class="programlisting">int	htmlAutoCloseTag		(<a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a> doc, <br/>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * name, <br/>					 <a href="libxml2-HTMLparser.html#htmlNodePtr">htmlNodePtr</a> elem)<br/></pre><p>The HTML DTD allows a tag to implicitly close other tags. The list is kept in htmlStartClose array. This function checks if the element or one of it's children would autoclose the given tag.</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>doc</tt></i>:</span></td><td>the HTML document</td></tr><tr><td><span class="term"><i><tt>name</tt></i>:</span></td><td>The tag name</td></tr><tr><td><span class="term"><i><tt>elem</tt></i>:</span></td><td>the HTML element</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>1 if autoclose, 0 otherwise</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCreateMemoryParserCtxt"/>htmlCreateMemoryParserCtxt ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	htmlCreateMemoryParserCtxt	(const char * buffer, <br/>							 int size)<br/></pre><p>Create a parser context for an HTML in-memory document.</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>buffer</tt></i>:</span></td><td>a pointer to a char array</td></tr><tr><td><span class="term"><i><tt>size</tt></i>:</span></td><td>the size of the array</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>the new parser context or NULL</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCreatePushParserCtxt"/>htmlCreatePushParserCtxt ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a>	htmlCreatePushParserCtxt	(<a href="libxml2-HTMLparser.html#htmlSAXHandlerPtr">htmlSAXHandlerPtr</a> sax, <br/>							 void * user_data, <br/>							 const char * chunk, <br/>							 int size, <br/>							 const char * filename, <br/>							 <a href="libxml2-encoding.html#xmlCharEncoding">xmlCharEncoding</a> enc)<br/></pre><p>Create a parser context for using the HTML parser in push mode The value of @filename is used for fetching external entities and error/warning reports.</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>sax</tt></i>:</span></td><td>a SAX handler</td></tr><tr><td><span class="term"><i><tt>user_data</tt></i>:</span></td><td>The user data returned on SAX callbacks</td></tr><tr><td><span class="term"><i><tt>chunk</tt></i>:</span></td><td>a pointer to an array of chars</td></tr><tr><td><span class="term"><i><tt>size</tt></i>:</span></td><td>number of chars in the array</td></tr><tr><td><span class="term"><i><tt>filename</tt></i>:</span></td><td>an optional file name or URI</td></tr><tr><td><span class="term"><i><tt>enc</tt></i>:</span></td><td>an optional encoding</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>the new parser context or NULL</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCtxtReadDoc"/>htmlCtxtReadDoc ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadDoc		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br/>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * cur, <br/>					 const char * URL, <br/>					 const char * encoding, <br/>					 int options)<br/></pre><p>parse an XML in-memory document and build a tree. This reuses the existing @ctxt parser context</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>ctxt</tt></i>:</span></td><td>an HTML parser context</td></tr><tr><td><span class="term"><i><tt>cur</tt></i>:</span></td><td>a pointer to a zero terminated string</td></tr><tr><td><span class="term"><i><tt>URL</tt></i>:</span></td><td>the base URL to use for the document</td></tr><tr><td><span class="term"><i><tt>encoding</tt></i>:</span></td><td>the document encoding, or NULL</td></tr><tr><td><span class="term"><i><tt>options</tt></i>:</span></td><td>a combination of htmlParserOption(s)</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>the resulting document tree</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCtxtReadFd"/>htmlCtxtReadFd ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadFd		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br/>					 int fd, <br/>					 const char * URL, <br/>					 const char * encoding, <br/>					 int options)<br/></pre><p>parse an XML from a file descriptor and build a tree. This reuses the existing @ctxt parser context</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>ctxt</tt></i>:</span></td><td>an HTML parser context</td></tr><tr><td><span class="term"><i><tt>fd</tt></i>:</span></td><td>an open file descriptor</td></tr><tr><td><span class="term"><i><tt>URL</tt></i>:</span></td><td>the base URL to use for the document</td></tr><tr><td><span class="term"><i><tt>encoding</tt></i>:</span></td><td>the document encoding, or NULL</td></tr><tr><td><span class="term"><i><tt>options</tt></i>:</span></td><td>a combination of htmlParserOption(s)</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>the resulting document tree</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCtxtReadFile"/>htmlCtxtReadFile ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadFile	(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br/>					 const char * filename, <br/>					 const char * encoding, <br/>					 int options)<br/></pre><p>parse an XML file from the filesystem or the network. This reuses the existing @ctxt parser context</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>ctxt</tt></i>:</span></td><td>an HTML parser context</td></tr><tr><td><span class="term"><i><tt>filename</tt></i>:</span></td><td>a file or URL</td></tr><tr><td><span class="term"><i><tt>encoding</tt></i>:</span></td><td>the document encoding, or NULL</td></tr><tr><td><span class="term"><i><tt>options</tt></i>:</span></td><td>a combination of htmlParserOption(s)</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>the resulting document tree</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCtxtReadIO"/>htmlCtxtReadIO ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadIO		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br/>					 <a href="libxml2-xmlIO.html#xmlInputReadCallback">xmlInputReadCallback</a> ioread, <br/>					 <a href="libxml2-xmlIO.html#xmlInputCloseCallback">xmlInputCloseCallback</a> ioclose, <br/>					 void * ioctx, <br/>					 const char * URL, <br/>					 const char * encoding, <br/>					 int options)<br/></pre><p>parse an HTML document from I/O functions and source and build a tree. This reuses the existing @ctxt parser context</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>ctxt</tt></i>:</span></td><td>an HTML parser context</td></tr><tr><td><span class="term"><i><tt>ioread</tt></i>:</span></td><td>an I/O read function</td></tr><tr><td><span class="term"><i><tt>ioclose</tt></i>:</span></td><td>an I/O close function</td></tr><tr><td><span class="term"><i><tt>ioctx</tt></i>:</span></td><td>an I/O handler</td></tr><tr><td><span class="term"><i><tt>URL</tt></i>:</span></td><td>the base URL to use for the document</td></tr><tr><td><span class="term"><i><tt>encoding</tt></i>:</span></td><td>the document encoding, or NULL</td></tr><tr><td><span class="term"><i><tt>options</tt></i>:</span></td><td>a combination of htmlParserOption(s)</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>the resulting document tree</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCtxtReadMemory"/>htmlCtxtReadMemory ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlDocPtr">htmlDocPtr</a>	htmlCtxtReadMemory	(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br/>					 const char * buffer, <br/>					 int size, <br/>					 const char * URL, <br/>					 const char * encoding, <br/>					 int options)<br/></pre><p>parse an XML in-memory document and build a tree. This reuses the existing @ctxt parser context</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>ctxt</tt></i>:</span></td><td>an HTML parser context</td></tr><tr><td><span class="term"><i><tt>buffer</tt></i>:</span></td><td>a pointer to a char array</td></tr><tr><td><span class="term"><i><tt>size</tt></i>:</span></td><td>the size of the array</td></tr><tr><td><span class="term"><i><tt>URL</tt></i>:</span></td><td>the base URL to use for the document</td></tr><tr><td><span class="term"><i><tt>encoding</tt></i>:</span></td><td>the document encoding, or NULL</td></tr><tr><td><span class="term"><i><tt>options</tt></i>:</span></td><td>a combination of htmlParserOption(s)</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>the resulting document tree</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCtxtReset"/>htmlCtxtReset ()</h3><pre class="programlisting">void	htmlCtxtReset			(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt)<br/></pre><p>Reset a parser context</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>ctxt</tt></i>:</span></td><td>an HTML parser context</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlCtxtUseOptions"/>htmlCtxtUseOptions ()</h3><pre class="programlisting">int	htmlCtxtUseOptions		(<a href="libxml2-HTMLparser.html#htmlParserCtxtPtr">htmlParserCtxtPtr</a> ctxt, <br/>					 int options)<br/></pre><p>Applies the options to the parser context</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>ctxt</tt></i>:</span></td><td>an HTML parser context</td></tr><tr><td><span class="term"><i><tt>options</tt></i>:</span></td><td>a combination of htmlParserOption(s)</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>0 in case of success, the set of unknown or unimplemented options in case of error.</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlElementAllowedHere"/>htmlElementAllowedHere ()</h3><pre class="programlisting">int	htmlElementAllowedHere		(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * parent, <br/>					 const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * elt)<br/></pre><p>Checks whether an HTML element may be a direct child of a parent element. Note - doesn't check for deprecated elements</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>parent</tt></i>:</span></td><td>HTML parent element</td></tr><tr><td><span class="term"><i><tt>elt</tt></i>:</span></td><td>HTML element</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>1 if allowed; 0 otherwise.</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlElementStatusHere"/>htmlElementStatusHere ()</h3><pre class="programlisting"><a href="libxml2-HTMLparser.html#htmlStatus">htmlStatus</a>	htmlElementStatusHere	(const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * parent, <br/>					 const <a href="libxml2-HTMLparser.html#htmlElemDesc">htmlElemDesc</a> * elt)<br/></pre><p>Checks whether an HTML element may be a direct child of a parent element. and if so whether it is valid or deprecated.</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>parent</tt></i>:</span></td><td>HTML parent element</td></tr><tr><td><span class="term"><i><tt>elt</tt></i>:</span></td><td>HTML element</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>one of HTML_VALID, HTML_DEPRECATED, <a href="libxml2-HTMLparser.html#HTML_INVALID">HTML_INVALID</a></td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlEncodeEntities"/>htmlEncodeEntities ()</h3><pre class="programlisting">int	htmlEncodeEntities		(unsigned char * out, <br/>					 int * outlen, <br/>					 const unsigned char * in, <br/>					 int * inlen, <br/>					 int quoteChar)<br/></pre><p>Take a block of UTF-8 chars in and try to convert it to an ASCII plus HTML entities block of chars out.</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>out</tt></i>:</span></td><td>a pointer to an array of bytes to store the result</td></tr><tr><td><span class="term"><i><tt>outlen</tt></i>:</span></td><td>the length of @out</td></tr><tr><td><span class="term"><i><tt>in</tt></i>:</span></td><td>a pointer to an array of UTF-8 chars</td></tr><tr><td><span class="term"><i><tt>inlen</tt></i>:</span></td><td>the length of @in</td></tr><tr><td><span class="term"><i><tt>quoteChar</tt></i>:</span></td><td>the quote character to escape (' or ") or zero.</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>0 if success, -2 if the transcoding fails, or -1 otherwise The value of @inlen after return is the number of octets consumed as the return value is positive, else unpredictable. The value of @outlen after return is the number of octets consumed.</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlEntityLookup"/>htmlEntityLookup ()</h3><pre class="programlisting">const <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> *	htmlEntityLookup	(const <a href="libxml2-xmlstring.html#xmlChar">xmlChar</a> * name)<br/></pre><p>Lookup the given entity in EntitiesTable TODO: the linear scan is really ugly, an hash table is really needed.</p><div class="variablelist"><table border="0"><col align="left"/><tbody><tr><td><span class="term"><i><tt>name</tt></i>:</span></td><td>the entity name</td></tr><tr><td><span class="term"><i><tt>Returns</tt></i>:</span></td><td>the associated <a href="libxml2-HTMLparser.html#htmlEntityDescPtr">htmlEntityDescPtr</a> if found, NULL otherwise.</td></tr></tbody></table></div></div>        <hr/>        <div class="refsect2" lang="en"><h3><a name="htmlEntityValueLookup"/>htmlEntityValueLookup ()</h3><pre class="programlisting">const <a href="libxml2-HTMLparser.html#htmlEntityDesc">htmlEntityDesc</a> *	htmlEntityValueLookup	(unsigned int value)<br/>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -