⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 sgmls.txt

📁 harvest是一个下载html网页得机器人
💻 TXT
📖 第 1 页 / 共 2 页
字号:
SGMLS(1)                                                 SGMLS(1)NAME       sgmls - a validating SGML parser       An SGML System Conforming to       International Standard ISO 8879 --       Standard Generalized Markup LanguageSYNOPSIS       sgmls  [  -deglprsuv  ] [ -cfile ] [ -iname ] [ -mfile ] [       filename...  ]DESCRIPTION       Sgmls parses and validates the  SGML  document  entity  in       filename...   and  prints  on the standard output a simple       ASCII representation of its Element Structure  Information       Set.   (This  is  the  information  set which a structure-       controlled conforming SGML application should  act  upon.)       Note  that  the document entity may be spread amongst sev-       eral files; for example, the  SGML  declaration,  document       type  declaration  and document instance set could each be       in a separate file.  If no filenames are  specified,  then       sgmls  will  read  the  document  entity from the standard       input.  A filename of - can also be used to refer  to  the       standard input.       The following options are available:       -cfile Report  any  capacity  limits that are exceeded and              write a report of  capacity  usage  to  file.   The              report  is in the format of a RACT result.  RACT is              the  Reference  Application  for  Capacity  Testing              defined  in the Proposed American National Standard              Conformance Testing for Standard Generalized Markup              Language  (SGL)  Systems  (X3.190-199X), Draft July              1991.       -d     Warn about duplicate entity declarations.       -e     Describe open entities in  error  messages.   Error              messages  always  include  the position of the most              recently opened external entity.       -g     Show the GIs of open elements in error messages.       -iname Pretend that                     <!ENTITY % name "INCLUDE">              occurs at the start of the document  type  declara-              tion  subset  in  the  SGML document entity.  Since              repeated definitions of an entity are ignored, this              definition will take precedence over any other def-              initions of this entity in the document type decla-              ration.   Multiple  -i options are allowed.  If the                                                                1SGMLS(1)                                                 SGMLS(1)              SGML declaration replaces the reserved name INCLUDE              then  the new reserved name will be the replacement              text of the entity.  Typically  the  document  type              declaration will contain                     <!ENTITY % name "IGNORE">              and  will use %name; in the status keyword specifi-              cation of a marked section  declaration.   In  this              case  the effect of the option will be to cause the              marked section not to be ignored.       -l     Output L commands giving the  current  line  number              and filename.       -mfile Map  public  identifiers and entity names to system              identifiers using  the  catalog  entry  file  file.              Multiple  -m  options  are  allowed.  Catalog entry              files specified with the -m option will be searched              before the defaults.       -p     Parse only the prolog.  Sgmls will exit after pars-              ing the document type declaration.  Implies -s.       -r     Warn about defaulted references.       -s     Suppress output.   Error  messages  will  still  be              printed.       -u     Warn about undefined elements: elements used in the              DTD but not defined.       -v     Print the version number.   Entity Manager       An external entity resides in  one  or  more  files.   The       entity manager component of sgmls maps a sequence of files       into an entity in three sequential stages:       1.     each carriage return character  is  turned  into  a              non-SGML character;       2.     each  newline character is turned into a record end              character, and at the  same  time  a  record  start              character  is  inserted  at  the  beginning of each              line;       3.     the files are concatenated.       A system identifier is interpreted as a list of  filenames       separated by colons.  A filename of - can be used to refer       to the standard input.       If a system identifier is not specified, then  the  entity                                                                2SGMLS(1)                                                 SGMLS(1)       manager  can generate one using catalog entry files in the       format defined in the SGML Open Draft Technical Resolution       on  Entity  Management.   A  catalog entry file contains a       sequence of entries in one of the following four forms:       PUBLIC pubid sysid              This specifies that sysid should  be  used  as  the              system  identifier  if the the public identifier is              pubid.  Sysid is a system identifier as defined  in              ISO  8879  and  pubid  is  a  public  identifier as              defined in ISO 8879.       ENTITY name sysid              This specifies that sysid should  be  used  as  the              system identifier if the entity is a general entity              whose name is name.       ENTITY %name sysid              This specifies that sysid should  be  used  as  the              system  identifier  if  the  entity  is a parameter              entity whose name is name.  Note that there  is  no              space between the % and the name.       DOCTYPE name sysid              This  specifies  that  sysid  should be used as the              system  identifier  if  the  entity  is  an  entity              declared in a document type declaration whose docu-              ment type name is name.       The last two forms are extensions to the SGML Open format.       The  delimiters  can be omitted from the sysid provided it       does not contain any white space.   Comments  are  allowed       between  parameters delimited by -- as in SGML.  The envi-       ronment  variable  SGML_CATALOG_FILES  contains  a  colon-       separated  list  of  catalog  entry  files.  These will be       searched after any catalog entry files specified using the       -m  option.  If this environment variable is not set, then       a system dependent list of catalog  entry  files  will  be       used.   A match in a catalog entry file for a PUBLIC entry       will take precedence over a match in the same file for  an       ENTITY  or  DOCTYPE entry.  A filename in a system identi-       fier in a catalog entry file is  interpreted  relative  to       the directory containing the catalog entry file.       If no match can be found in a catalog entry file, then the       entity manager will attempt to generate a  filename  using       the public identifier (if there is one) and other informa-       tion available to it.  Notation identifiers are  not  sub-       ject to this treatment.  This process is controlled by the       environment variable SGML_PATH;  this  contains  a  colon-       separated list of filename templates.  A filename template       is a filename that may contain substitution fields; a sub-       stitution field is a % character followed by a single let-       ter that indicates the value  of  the  substitution.   The                                                                3SGMLS(1)                                                 SGMLS(1)       value  of  a substitution can either be a string or it can       be null.  The entity manager transforms the list of  file-       name  templates  into  a list of filenames by substituting       for each substitution field and  discarding  any  template       that  contained a substitution field whose value was null.       It then uses the first resulting filename that exists  and       is  readable.   Substitution values are transformed before       being used for substitution: firstly, any names that  were       subject  to  upper  case  substitution are folded to lower       case; secondly, space characters are mapped to underscores       and  slashes  are mapped to percents.  The value of the %S       field is not  transformed.   The  values  of  substitution       fields are as follows:       %%     A single %.       %D     The entity's data content notation.  This substitu-              tion will succeed only for external data  entities.       %N     The entity, notation or document type name.       %P     The public identifier if there was a public identi-              fier, otherwise null.       %S     The system identifier if there was a system identi-              fier otherwise null.       %X     (This  is  provided  mainly  for compatibility with              ARCSGML.)  A three-letter string chosen as follows:                                         |            |                                         |            | With public identifier                                         |            +-------------+-----------                                         | No public  |   Device    |  Device                                         | identifier | independent | dependent              ---------------------------+------------+-------------+-----------              Data or subdocument entity | nsd        | pns         | vns              General SGML text entity   | gml        | pge         | vge              Parameter entity           | spe        | ppe         | vpe              Document type definition   | dtd        | pdt         | vdt              Link process definition    | lpd        | plp         | vlp              The  device  dependent  version  is selected if the              public text class allows a public text display ver-              sion  but no public text display version was speci-              fied.       %Y     The type of thing for which the filename  is  being              generated:              SGML subdocument entity    sgml              Data entity                data              General text entity        text              Parameter entity           parm              Document type definition   dtd                                                                4SGMLS(1)                                                 SGMLS(1)              Link process definition    lpd       The  value  of  the  following substitution fields will be       null unless a valid formal public identifier was supplied.       %A     Null  if  the  text identifier in the formal public              identifier contains an unavailable text  indicator,              otherwise the empty string.       %C     The public text class, mapped to lower case.       %E     The   public   text  designating  sequence  (escape              sequence) if the public text class is CHARSET, oth-              erwise null.       %I     The  empty  string  if  the owner identifier in the              formal public identifier is an  ISO  owner  identi-              fier, otherwise null.       %L     The  public  text  language,  mapped to lower case,              unless the public text class is CHARSET,  in  which              case null.       %O     The  owner  identifier  (with the +// or -// prefix              stripped.)       %R     The empty string if the  owner  identifier  in  the              formal  public  identifier  is  a  registered owner              identifier, otherwise null.       %T     The public text description.       %U     The empty string if the  owner  identifier  in  the              formal  public  identifier is an unregistered owner              identifier, otherwise null.       %V     The public text display version.  This substitution              will  be  null  if  the  public text class does not              allow a display version or if no version was speci-              fied.   If  an empty version was specified, a value              of default will be used.       Normally if the external identifier for an entity includes       a system identifier, the entity manager will use the spec-       ified system identifier and not attempt to  generate  one.       If,  however, SGML_PATH uses the %S field, then the entity       manager will first search for a matching entry in the cat-       alog  entry files.  If a match is found, then this will be       used instead of the specified system  identifier.   Other-       wise,  if the specified system identifier does not contain       any colons, the entity manager will use SGML_PATH to  gen-       erate  a  filename.  Otherwise the entity manager will use       the specified system identifier.                                                                5SGMLS(1)                                                 SGMLS(1)   System declaration       The system declaration for sgmls is as follows:                          SYSTEM "ISO 8879:1986"                                  CHARSET       BASESET  "ISO 646-1983//CHARSET                 International Reference Version (IRV)//ESC 2/5 4/0"       DESCSET  0 128 0       CAPACITY PUBLIC  "ISO 8879:1986//CAPACITY Reference//EN"                                 FEATURES       MINIMIZE DATATAG NO  OMITTAG  YES   RANK     NO  SHORTTAG YES       LINK     SIMPLE  NO  IMPLICIT NO    EXPLICIT NO       OTHER    CONCUR  NO  SUBDOC   YES 1 FORMAL   YES       SCOPE    DOCUMENT       SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Reference//EN"       SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Core//EN"                                 VALIDATE                GENERAL YES MODEL    YES   EXCLUDE  YES CAPACITY YES                NONSGML YES SGML     YES   FORMAL   YES                                   SDIF                PACK    NO  UNPACK   NO       Exceeding a capacity limit will be ignored unless  the  -c       option is given.       The  memory usage of sgmls is not a function of the capac-       ity points used by a document; however, sgmls  can  handle       capacities significantly greater than the reference capac-

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -