manual.txt

$Date: 2000/08/13 18:19:40 $

Some features of modlogan are special and need more in-depth
documentation.

If this documentation is not enough, unclear and/or complete nonsense,
please tell me. Send a mail to <jan@kneschke.de>.

Table Of Contents
=================
0. modular design
1. input stream divider (logfile splitter)
2. running modlogan on non-closing pipe
3. input stream sorter
4. searchengines
5. default config file
6. userdefined headers/footers
7. userdefined logfile definition

0. modular design
=================

To understand what is described below you have to understand some of the
internals of modlogan. modlogan consists of 3 + 1 parts:

- input plugin
- processor plugin
- output plugin
and
- gluecode + mainloop

The input plugin has only one important function, which provides a record.
If this record belongs to the next month, a report is generated by the
output plugin. Afterwards the record is processed by the processor plugin.
This is repeated until the last record is parsed.

Some features change this workflow in some ways. They are described
below.

1. logfile splitter
===================
plugin: processor/web
option: splitby

If you want to split one input stream into different output streams, you
need a logfile splitter. You can use an external script as a preprocessor or
use the splitter support of the web processor plugin.

                         input stream (input plugin)
                                      |
                           +-- ... ---+--- ... ---+ (processor plugin)
                           |          |           |
                           |          |           |
                       output streams (output plugins)

To enable this feature, add a splitby definition to the processor_web
section of your config-file.
A splitby definition is the following string:

splitby=<field>,"<regex>",<name>

where <field> is:
srvhost - for the host which served the request
srvport - for the port the host listened on
requser - for the authenticated user
requrl  - for the requested url
reqhost - for the requesting host
refurl  - for the referring url
default - 'joker' which matches everything.

<regex> is a regular expression which has to match successfully, and
<name> is a name used to group the split records again. <name> is also used
as the name of the subdirectory where the reports are placed.

If you specify multiple splitby definitions, they are checked from the first
to the last. If one check is successful, the generated name is used as
<name>.

NOTE: the splitter has to return a name and it's your job to make sure that
a name is available
- by using an always-matching definition
  e.g. splitby=srvhost,"(.*)",$1
- by specifying a 'default' definition.

Example 1:
----------

Let's assume that we have the following directory structure:

/users/~j.kneschke/index.html
/users/~project.modlogan/index.html

The definition
--
splitby=requrl,"^/users/~(.*?)/.*$",host_$1
--
will divide the records according to the string between the '~' and the '/'.
The string is captured and appended to the name.

/users/~j.kneschke/index.html        -> host_j.kneschke
/users/~project.modlogan/index.html  -> host_project.modlogan

Directories named host_j.kneschke and host_project.modlogan will be created
(if they don't exist) in the directory which you specify with
global:outputdir.

Each directory contains the reports for the respective split logs.

Example 2:
----------

/users/~<username>/index.html is an alias for <username>.domain.com and
modlogan shall treat it as such.

As the following two definitions generate the same name, they result in the
same report.
As the first match wins, you can switch between your installations
as you like.
--
splitby=requrl,"^/users/~(.*?)/.*$",host_$1
splitby=srvhost,"^(.*?)\..*$",host_$1
--

Possible Errors
---------------

The splitter will notice that nothing matched, and will print a warning and
ignore the record if no 'default' stream is available.
--
# split by all non-anonymous users
splitby=user,"^([^-]+)$",user_$1
--
If a page is requested by a non-authenticated user, the logfile will contain
a dash ('-'). As no splitby definition matches, the record will be ignored.

Support
-------
Only a few plugins are ported now:
input: every plugin works (they don't have to be changed)
processor: web
output: modlogan

2. non-closing pipes
====================
plugin: global
option: gen_report_threshold

By default modlogan generates a report at the end of each parsed month and
at the end of each run. This isn't enough if you run modlogan on a
non-closing pipe like:

$ tail -f access.log | modlogan -c <config-file>

In this scenario modlogan generates a report only once a month, which isn't
ideal. To circumvent this problem you can specify a number of records after
which a report is generated.

If you set a threshold of 1000, modlogan will generate a report
- after every 1000 records
- at the end of each month and
- at the end of the run.

The last case can't occur in this scenario because modlogan won't stop by
itself.

3. input stream sorter
======================
plugin: input/clf
option: readaheadlimit

If multiple servers are writing into one logfile, the records are normally
not in order, but modlogan expects them to be in order. They have to be
sorted either by an external run of sort on the whole logfile or by the
internal logfile sorter.

By enabling the internal input sorter, the input plugin will read the
specified number of records, parse them and put them into a sorted
list. The oldest record is taken from the list and returned to the mainloop
for further processing.
As the internal list of parsed records is then no longer full, the next call
of the input parser adds another line to the sorted list of parsed records.
The oldest record is taken from the list again, and the game repeats.

You can control this feature by specifying the maximum number of lines which
has to be reached before a record is returned.

NOTE: as the specified number of lines is kept in memory you shouldn't
increase the number too much.

Example
-------
[input_clf]
readaheadlimit = 100

4. searchengines
================
plugin: processor/web
option: searchengines
related options: debug_searchengines

The file modlogan.searchengines contains a list of known searchengines. This
list is used by the 'web' processor module for searchstring and searchengine
detection.

The list is far from complete and has to be maintained by you, the user. If
you've found a new searchengine in your logfile (enabling
'debug_searchengines=1' in the processor section of your config-file will
help you a lot) you have to add it to modlogan.searchengines to use this
string in the next run.

Adding SearchEngines
--------------------

An entry in the searchengines description file consists of the host string
and the getvars part. The getvars part is used as a group string and is
enclosed in brackets '[]'. Every url that uses a getvars key which is
already found in one of those brackets has to be placed below this group
string. The url part is a regex string.

The number after the url should be set to zero and will be used for a
searchstring rating in the future.

Lines starting with '#' are comments.

Example:
--------
You've got the following output from modlogan:

o SK: ?? http://www.searchalot.com/texis/open/meta/main.htm ->q=free+catalogue

This means that a known key was detected ('q') but the url is not listed in
the modlogan.searchengines file.

Now we have to choose the right section in the searchengines definition
file.
As the section name is the detected key, we have to go to the following
section:

[q]

After we've written the regex for the URL, the new searchengine definition
looks like this:

[q]
# ...
# in the first attempt we try to be as accurate as possible
# if other pages from the same site are hitting this section
# we'll shorten the match to
# searchalot.com/texis/,0
searchalot.com/texis/open/meta/main.htm$,0
# ...

5. Default configfile
=====================

If you are generating reports for multiple sites you often don't want to
rewrite the whole configfile for each server. Using a default-configfile
which contains all the default values eases the update process a lot. The
server-specific config-file is small and only contains the options that vary
between the servers: servername, grouped hosts, hidden referrer...

You should understand how the options from the config file interact with the
options of the default-configfile.

There are three possibilities:
- the value will be overwritten by the next occurrence of the same key
- the value is appended to the list
- only the first occurrence is written

How each option reacts is specified in ./doc/plugin-option.txt or
./doc/plugin-options.html.

The third possibility is only used by the 4 keys of the global section:
- inputplugin
- outputplugin
- processorplugin
- defaultconfigfile

As these options can only be written once, you have to specify the plugins
BEFORE you define the default-configfile, as the default-configfile is parsed
right at the occurrence of the default-configfile option.

Example
-------
[global]
inputplugin=null
outputplugin=modlogan
processorplugin=web
default_configfile=modlogan.def.conf

6. userdefined headers/footers
==============================
plugin: output/modlogan
option: htmlheader, htmlfooter

By default modlogan will generate complete HTML pages with a header and
footer. The header contains the full HTML header (DOCTYPE, TITLE, ...),
the opening BODY tag and the standard header ("Statistics for ...").
The default footer consists of a horizontal line (<HR>), a link which
points to the home of modlogan and the two images which indicate that this
is valid HTML 4.0.

In some circumstances these parts are not wanted. In these cases you can
supply two filenames which are used instead of the default header/footer.

Example:
--------
[output_modlogan]
# the page is embedded into a surrounding HTML-page via Server-Side-Includes
# blank.ihtml is an empty file.
htmlfooter=blank.ihtml
htmlheader=blank.ihtml

7. userdefined logfile definition
=================================
plugin: input/clf
option: format

By default the clf input plugin tries to find out the type of the logfile.
It checks for:
- common logfile (Apache: CustomLog common)
- combined logfile (Apache: CustomLog combined)
- squid logfile

If your logfile definition is different from these three definitions, you
can provide the logfile definition in the config-file.

You can copy the CustomLog string directly from the httpd.conf of your
Apache. For the definition of the different options you can specify, read
the documentation of 'mod_log_config' [-> Apache Manual]; for the
availability of these options in combination with modlogan, see
./src/input/clf/plugin_config.h.

Example:
--------
[input_clf]
format=%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %v %p %T
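
To get a feeling for what such a format string describes, here is a small
Python sketch that turns the format from the example into a regular
expression and applies it to one log line. It is an illustration only and
not how modlogan parses logfiles. The names DIRECTIVES, format_to_regex and
SAMPLE, the sample log line and the regex chosen for each directive are all
assumptions made for this sketch. Only the directives used in the example
are covered.

```python
import re

# Assumed mapping from the CustomLog directives used in the example to
# regex fragments. Illustration only, not modlogan's parser.
DIRECTIVES = {
    "%h": r"(?P<host>\S+)",                  # requesting host
    "%l": r"(?P<ident>\S+)",                 # remote logname
    "%u": r"(?P<user>\S+)",                  # authenticated user or '-'
    "%t": r"\[(?P<time>[^\]]+)\]",           # timestamp in brackets
    "%r": r"(?P<request>[^\"]*)",            # first line of the request
    "%>s": r"(?P<status>\d{3})",             # final status code
    "%b": r"(?P<bytes>\d+|-)",               # bytes sent or '-'
    "%{Referer}i": r"(?P<referer>[^\"]*)",   # Referer header
    "%{User-Agent}i": r"(?P<agent>[^\"]*)",  # User-Agent header
    "%v": r"(?P<vhost>\S+)",                 # serving host
    "%p": r"(?P<port>\d+)",                  # serving port
    "%T": r"(?P<seconds>\d+)",               # seconds taken
}

# Matches either a %{Name}x directive or a plain %x / %>x directive.
TOKEN = re.compile(r"%\{[^}]+\}\w|%>?\w")


def format_to_regex(fmt):
    """Build a compiled regex from a CustomLog-style format string."""
    fmt = fmt.replace('\\"', '"')  # \" in the config means a literal quote
    parts = []
    pos = 0
    for m in TOKEN.finditer(fmt):
        parts.append(re.escape(fmt[pos:m.start()]))  # literal text
        parts.append(DIRECTIVES[m.group(0)])  # KeyError if unsupported
        pos = m.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile("^" + "".join(parts) + "$")


FORMAT = r'%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %v %p %T'

# Made-up log line that follows FORMAT.
SAMPLE = ('192.168.0.1 - - [13/Aug/2000:18:19:40 +0200] '
          '"GET /index.html HTTP/1.0" 200 1234 "-" "Mozilla/4.0" '
          'www.example.com 80 0')

match = format_to_regex(FORMAT).match(SAMPLE)
if match:
    print(match.group("vhost"), match.group("status"), match.group("request"))
```

Every directive becomes a named group, so a field such as the serving host
(%v) can be read by name after the match. A directive that is missing from
the table raises a KeyError instead of being silently skipped.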