📄 html::parser.3

📁 Implementation code for the DDNS protocol module of the networking part of a video surveillance system; corrections are welcome.
would be reported in between are suppressed.  The ignored elements can
contain nested occurrences of themselves.  Example:
.Sp
.Vb 1
\&   $p\->ignore_elements(qw(script style));
.Ve
.Sp
The \f(CW\*(C`script\*(C'\fR and \f(CW\*(C`style\*(C'\fR tags will always nest properly since their
content is parsed in \s-1CDATA\s0 mode.  For most other tags
\&\f(CW\*(C`ignore_elements\*(C'\fR must be used with caution since \s-1HTML\s0 is often not
\&\fIwell formed\fR.
.ie n .IP "$p\fR\->ignore_tags( \f(CW@tags )" 4
.el .IP "\f(CW$p\fR\->ignore_tags( \f(CW@tags\fR )" 4
.IX Item "$p->ignore_tags( @tags )"
Any \f(CW\*(C`start\*(C'\fR and \f(CW\*(C`end\*(C'\fR events involving any of the tags given are
suppressed.  To reset the filter (i.e. don't suppress any \f(CW\*(C`start\*(C'\fR and
\&\f(CW\*(C`end\*(C'\fR events), call \f(CW\*(C`ignore_tags\*(C'\fR without an argument.
.ie n .IP "$p\fR\->report_tags( \f(CW@tags )" 4
.el .IP "\f(CW$p\fR\->report_tags( \f(CW@tags\fR )" 4
.IX Item "$p->report_tags( @tags )"
Any \f(CW\*(C`start\*(C'\fR and \f(CW\*(C`end\*(C'\fR events involving any of the tags \fInot\fR given
are suppressed.  To reset the filter (i.e. report all \f(CW\*(C`start\*(C'\fR and
\&\f(CW\*(C`end\*(C'\fR events), call \f(CW\*(C`report_tags\*(C'\fR without an argument.
.PP
Internally, the system has two filter lists, one for \f(CW\*(C`report_tags\*(C'\fR
and one for \f(CW\*(C`ignore_tags\*(C'\fR, and both filters are applied.  This
effectively gives \f(CW\*(C`ignore_tags\*(C'\fR precedence over \f(CW\*(C`report_tags\*(C'\fR.
.PP
Examples:
.PP
.Vb 2
\&   $p\->ignore_tags(qw(style));
\&   $p\->report_tags(qw(script style));
.Ve
.PP
results in only \f(CW\*(C`script\*(C'\fR events being reported.
.Sh "Argspec"
.IX Subsection "Argspec"
Argspec is a string containing a comma-separated list that describes
the information reported by the event.
The following argspec
identifier names can be used:
.ie n .IP """attr""" 4
.el .IP "\f(CWattr\fR" 4
.IX Item "attr"
Attr causes a reference to a hash of attribute name/value pairs to be
passed.
.Sp
Boolean attributes' values are either the value set by
\&\f(CW$p\fR\->boolean_attribute_value, or the attribute name if no value has been
set by \f(CW$p\fR\->boolean_attribute_value.
.Sp
This passes undef except for \f(CW\*(C`start\*(C'\fR events.
.Sp
Unless \f(CW\*(C`xml_mode\*(C'\fR or \f(CW\*(C`case_sensitive\*(C'\fR is enabled, the attribute
names are forced to lower case.
.Sp
General entities are decoded in the attribute values and
one layer of matching quotes enclosing the attribute values is removed.
.Sp
The Unicode character set is assumed for entity decoding.  With Perl
version 5.6 or earlier only the Latin\-1 range is supported, and
entities for characters outside the range 0..255 are left unchanged.
.ie n .IP "@attr" 4
.el .IP "\f(CW@attr\fR" 4
.IX Item "@attr"
Basically the same as \f(CW\*(C`attr\*(C'\fR, but keys and values are passed as
individual arguments and the original sequence of the attributes is
kept.  The parameters passed will be the same as the \f(CW@attr\fR calculated
here:
.Sp
.Vb 1
\&   @attr = map { $_ => $attr\->{$_} } @$attrseq;
.Ve
.Sp
assuming \f(CW$attr\fR and \f(CW$attrseq\fR here are the hash and array passed as the
result of \f(CW\*(C`attr\*(C'\fR and \f(CW\*(C`attrseq\*(C'\fR argspecs.
.Sp
This passes no values for events besides \f(CW\*(C`start\*(C'\fR.
.ie n .IP """attrseq""" 4
.el .IP "\f(CWattrseq\fR" 4
.IX Item "attrseq"
Attrseq causes a reference to an array of attribute names to be
passed.
This can be useful if you want to walk the \f(CW\*(C`attr\*(C'\fR hash in
the original sequence.
.Sp
This passes undef except for \f(CW\*(C`start\*(C'\fR events.
.Sp
Unless \f(CW\*(C`xml_mode\*(C'\fR or \f(CW\*(C`case_sensitive\*(C'\fR is enabled, the attribute
names are forced to lower case.
.ie n .IP """column""" 4
.el .IP "\f(CWcolumn\fR" 4
.IX Item "column"
Column causes the column number of the start of the event to be passed.
The first column on a line is 0.
.ie n .IP """dtext""" 4
.el .IP "\f(CWdtext\fR" 4
.IX Item "dtext"
Dtext causes the decoded text to be passed.  General entities are
automatically decoded unless the event was inside a \s-1CDATA\s0 section or
was between literal start and end tags (\f(CW\*(C`script\*(C'\fR, \f(CW\*(C`style\*(C'\fR,
\&\f(CW\*(C`xmp\*(C'\fR, and \f(CW\*(C`plaintext\*(C'\fR).
.Sp
The Unicode character set is assumed for entity decoding.  With Perl
version 5.6 or earlier only the Latin\-1 range is supported, and
entities for characters outside the range 0..255 are left unchanged.
.Sp
This passes undef except for \f(CW\*(C`text\*(C'\fR events.
.ie n .IP """event""" 4
.el .IP "\f(CWevent\fR" 4
.IX Item "event"
Event causes the event name to be passed.
.Sp
The event name is one of \f(CW\*(C`text\*(C'\fR, \f(CW\*(C`start\*(C'\fR, \f(CW\*(C`end\*(C'\fR, \f(CW\*(C`declaration\*(C'\fR,
\&\f(CW\*(C`comment\*(C'\fR, \f(CW\*(C`process\*(C'\fR, \f(CW\*(C`start_document\*(C'\fR or \f(CW\*(C`end_document\*(C'\fR.
.ie n .IP """is_cdata""" 4
.el .IP "\f(CWis_cdata\fR" 4
.IX Item "is_cdata"
Is_cdata causes a \s-1TRUE\s0 value to be passed if the event is inside a \s-1CDATA\s0
section or between literal start and end tags (\f(CW\*(C`script\*(C'\fR,
\&\f(CW\*(C`style\*(C'\fR, \f(CW\*(C`xmp\*(C'\fR, and \f(CW\*(C`plaintext\*(C'\fR).
.Sp
If the flag is \s-1FALSE\s0 for a text event, then you should normally
either use \f(CW\*(C`dtext\*(C'\fR or decode the entities yourself before the text is
processed further.
.ie n .IP """length""" 4
.el .IP "\f(CWlength\fR" 4
.IX Item "length"
Length
causes the number of bytes of the source text of the event to
be passed.
.ie n .IP """line""" 4
.el .IP "\f(CWline\fR" 4
.IX Item "line"
Line causes the line number of the start of the event to be passed.
The first line in the document is 1.  Line counting doesn't start
until at least one handler requests this value to be reported.
.ie n .IP """offset""" 4
.el .IP "\f(CWoffset\fR" 4
.IX Item "offset"
Offset causes the byte position in the \s-1HTML\s0 document of the start of
the event to be passed.  The first byte in the document has offset 0.
.ie n .IP """offset_end""" 4
.el .IP "\f(CWoffset_end\fR" 4
.IX Item "offset_end"
Offset_end causes the byte position in the \s-1HTML\s0 document of the end of
the event to be passed.  This is the same as \f(CW\*(C`offset\*(C'\fR + \f(CW\*(C`length\*(C'\fR.
.ie n .IP """self""" 4
.el .IP "\f(CWself\fR" 4
.IX Item "self"
Self causes the current object to be passed to the handler.  If the
handler is a method, this must be the first element in the argspec.
.Sp
An alternative to passing self as an argspec is to register closures
that capture \f(CW$self\fR by themselves as handlers.  Unfortunately this
creates circular references which prevent the HTML::Parser object
from being garbage collected.  Using the \f(CW\*(C`self\*(C'\fR argspec avoids this
problem.
.ie n .IP """skipped_text""" 4
.el .IP "\f(CWskipped_text\fR" 4
.IX Item "skipped_text"
Skipped_text returns the concatenated text of all the events that have
been skipped since the last time an event was reported.  Events might
be skipped because no handler is registered for them or because some
filter applies.  Skipped text also includes marked section markup,
since there are no events that can catch it.
.Sp
If an \f(CW""\fR\-handler is registered for an event, then the text for this
event is not included in \f(CW\*(C`skipped_text\*(C'\fR.
Skipped text both before
and after the \f(CW""\fR\-event is included in the next reported
\&\f(CW\*(C`skipped_text\*(C'\fR.
.ie n .IP """tag""" 4
.el .IP "\f(CWtag\fR" 4
.IX Item "tag"
Same as \f(CW\*(C`tagname\*(C'\fR, but prefixed with \*(L"/\*(R" if it belongs to an \f(CW\*(C`end\*(C'\fR
event and \*(L"!\*(R" for a declaration.  The \f(CW\*(C`tag\*(C'\fR does not have any prefix
for \f(CW\*(C`start\*(C'\fR events, and is in this case identical to \f(CW\*(C`tagname\*(C'\fR.
.ie n .IP """tagname""" 4
.el .IP "\f(CWtagname\fR" 4
.IX Item "tagname"
This is the element name (or \fIgeneric identifier\fR in \s-1SGML\s0 jargon) for
start and end tags.  Since \s-1HTML\s0 is case insensitive, this name is
forced to lower case to ease string matching.
.Sp
Since \s-1XML\s0 is case sensitive, the tagname case is not changed when
\&\f(CW\*(C`xml_mode\*(C'\fR is enabled.  The same happens if the \f(CW\*(C`case_sensitive\*(C'\fR attribute
is set.
.Sp
The declaration type of declaration elements is also passed as a tagname,
even if that is a bit strange.
In fact, in the current implementation tagname is
identical to \f(CW\*(C`token0\*(C'\fR except that the name may be forced to lower case.
.ie n .IP """token0""" 4
.el .IP "\f(CWtoken0\fR" 4
.IX Item "token0"
Token0 causes the original text of the first token string to be
passed.  This should always be the same as \f(CW$tokens\fR\->[0].
.Sp
For \f(CW\*(C`declaration\*(C'\fR events, this is the declaration type.
.Sp
For \f(CW\*(C`start\*(C'\fR and \f(CW\*(C`end\*(C'\fR events, this is the tag name.
.Sp
For \f(CW\*(C`process\*(C'\fR and non-strict \f(CW\*(C`comment\*(C'\fR events, this is everything
inside the tag.
.Sp
This passes undef if there are no tokens in the event.
.ie n .IP """tokenpos""" 4
.el .IP "\f(CWtokenpos\fR" 4
.IX Item "tokenpos"
Tokenpos causes a reference to an array of token positions to be
passed.  For each string that appears in \f(CW\*(C`tokens\*(C'\fR, this array
contains two numbers.
The first number is the offset of the start of
the token in the original \f(CW\*(C`text\*(C'\fR and the second number is the length
of the token.
.Sp
Boolean attributes in a \f(CW\*(C`start\*(C'\fR event will have (0,0) for the
attribute value offset and length.
.Sp
This passes undef if there are no tokens in the event (e.g., \f(CW\*(C`text\*(C'\fR)
and for artificial \f(CW\*(C`end\*(C'\fR events triggered by empty element tags.
.Sp
If you are using these offsets and lengths to modify \f(CW\*(C`text\*(C'\fR, you
should either work from right to left, or be very careful to calculate
the changes to the offsets.
.ie n .IP """tokens""" 4
.el .IP "\f(CWtokens\fR" 4
.IX Item "tokens"
Tokens causes a reference to an array of token strings to be passed.
The strings are exactly as they were found in the original text,
no decoding or case changes are applied.
.Sp
For \f(CW\*(C`declaration\*(C'\fR events, the array contains each word, comment, and
delimited string starting with the declaration type.
.Sp
For \f(CW\*(C`comment\*(C'\fR events, this contains each sub-comment.  If
\&\f(CW$p\fR\->strict_comments is disabled, there will be only one sub-comment.
.Sp
For \f(CW\*(C`start\*(C'\fR events, this contains the original tag name followed by
the attribute name/value pairs.  The values of boolean attributes will
be either the value set by \f(CW$p\fR\->boolean_attribute_value, or the
attribute name if no value has been set by
\&\f(CW$p\fR\->boolean_attribute_value.
.Sp
For \f(CW\*(C`end\*(C'\fR events, this contains the original tag name (always one token).
.Sp
For \f(CW\*(C`process\*(C'\fR events, this contains the process instructions (always one
token).
.Sp
This passes \f(CW\*(C`undef\*(C'\fR for \f(CW\*(C`text\*(C'\fR events.
.ie n .IP """text""" 4
.el .IP "\f(CWtext\fR" 4
.IX Item "text"
Text causes the source text (including markup element delimiters) to be
passed.
.ie n .IP """undef""" 4
.el .IP "\f(CWundef\fR" 4
.IX Item "undef"
Pass an undefined value.
Useful as padding where the same handler
routine is registered for multiple events.
.ie n .IP "\*(Aq...\*(Aq" 4
.el .IP "\f(CW\*(Aq...\*(Aq\fR" 4
.IX Item "..."
A literal string of 0 to 255 characters enclosed
in single (') or double (") quotes is passed as entered.
.PP
The whole argspec string can be wrapped up in \f(CW\*(Aq@{...}\*(Aq\fR to signal
that the resulting event array should be flattened.  This only makes a
difference if an array reference is used as the handler target.
Consider this example:
.PP
.Vb 2
\&   $p\->handler(text => [], \*(Aqtext\*(Aq);
\&   $p\->handler(text => [], \*(Aq@{text}\*(Aq);
.Ve
.PP
With two text events; \f(CW"foo"\fR, \f(CW"bar"\fR; then the first example will end
up with [[\*(L"foo\*(R"], [\*(L"bar\*(R"]] and the second with [\*(L"foo\*(R", \*(L"bar\*(R"] in
the handler target array.
.Sh "Events"
.IX Subsection "Events"
Handlers for the following events can be registered:
.ie n .IP """comment""" 4
.el .IP "\f(CWcomment\fR" 4
.IX Item "comment"
This event is triggered when a markup comment is recognized.
.Sp
Example:
.Sp
.Vb 1
\&  <!\-\- This is a comment \-\- \-\- So is this \-\->
.Ve
.ie n .IP """declaration""" 4
.el .IP "\f(CWdeclaration\fR" 4
.IX Item "declaration"
This event is triggered when a \fImarkup declaration\fR is recognized.
.Sp
For typical \s-1HTML\s0 documents, the only declaration you are
likely to find is <!DOCTYPE ...>.
.Sp
Example:
.Sp
.Vb 2
\&  <!DOCTYPE HTML PUBLIC "\-//W3C//DTD HTML 4.01//EN"
\&  "http://www.w3.org/TR/html40/strict.dtd">
.Ve
.Sp
DTDs inside <!DOCTYPE ...> will confuse HTML::Parser.
.ie n .IP """default""" 4
.el .IP "\f(CWdefault\fR" 4
.IX Item "default"
This event is triggered for events that do not have a specific
handler.  You can set up a handler for this event to catch stuff you
did not want to catch explicitly.
.ie n .IP """end""" 4
.el .IP "\f(CWend\fR" 4
.IX Item "end"
This event is triggered when an end tag is recognized.
.Sp
Example:
.Sp
