📄 indexer.conf
Disallow Regex \.r[0-9][0-9]$ \.a[0-9][0-9]$ \.so\.[0-9]$


##########################################################################
#CheckOnly [Match|NoMatch] [NoCase|Case] [String|Regex] <arg> [<arg> ... ]
# The meaning of the first three optional parameters is exactly the same
# as for the "Allow" command.
# Indexer will use HEAD instead of the GET HTTP method for URLs that
# match (or do not match) the given regular expressions. This means the
# file will only be checked for existence and will not be downloaded.
# Useful for zip, exe, arj and other binary files.
# Note that you can disallow those files with commands given below.
# You may use several arguments for one "CheckOnly" command.
# Useful for example for searching through the URL names rather than
# the contents (a la FTP-search).
# Takes global effect for config file.
#
# Check some known non-text extensions using "string" match:
#CheckOnly *.b *.sh *.md5
#CheckOnly *.arj *.tar *.zip *.tgz *.gz
#CheckOnly *.lha *.lzh *.rar *.zoo *.tar*.Z
#CheckOnly *.gif *.jpg *.jpeg *.bmp *.tiff
#CheckOnly *.vdo *.mpeg *.mpe *.mpg *.avi *.movie
#CheckOnly *.mid *.mp3 *.rm *.ram *.wav *.aiff
#CheckOnly *.vrml *.wrl *.png
#CheckOnly *.exe *.cab *.dll *.bin *.class
#CheckOnly *.tex *.texi *.xls *.doc *.texinfo
#CheckOnly *.rtf *.pdf *.cdf *.ps
#CheckOnly *.ai *.eps *.ppt *.hqx
#CheckOnly *.cpt *.bms *.oda *.tcl
#CheckOnly *.rpm *.m3u *.qt *.mov
#CheckOnly *.map *.aif *.sit *.sea
#
# or check ANY except known text extensions using "regex" match:
#CheckOnly NoMatch Regex \/$|\.html$|\.shtml$|\.phtml$|\.php$|\.txt$
#CheckOnly NoMatch Regex &rand=[0-9][0-9][0-9][0-9]$|myagenda\.php.*$|whoisonline\.php.*$


##########################################################################
#HrefOnly [Match|NoMatch] [NoCase|Case] [String|Regex] <arg> [<arg> ... ]
# The meaning of the first three optional parameters is exactly the same
# as for the "Allow" command.
#
# Use this to scan an HTML page for "href" tags without indexing the
# contents of the page, for URLs that match (or do not match) the given
# arguments.
# Commands have global effect for the whole configuration file.
#
# When indexing large mail list archives for example, the index and thread
# index pages (like mail.10.html, thread.21.html, etc.) should be scanned
# for links but shouldn't be indexed:
#
#HrefOnly */mail*.html */thread*.html

HrefOnly Match *dk_sid=*
HrefOnly Match *indexer_login.php*
HrefOnly Match */your.domain.com/index.php*
HrefOnly Match */document.php*
HrefOnly Match */courses/*/index.php
HrefOnly Match */courses/*/
HrefOnly Match */document/headerpage.php*
HrefOnly Match */document/slideshow.php*


##########################################################################
#CheckMp3 [Match|NoMatch] [NoCase|Case] [String|Regex] <arg> [<arg> ...]
# The meaning of the first three optional parameters is exactly the same
# as for the "Allow" command.
# If a URL matches the given rules, indexer will download only a small
# part of the document and try to find MP3 tags in it. On success, indexer
# will parse the MP3 tags; otherwise it will download the whole document
# and parse it as usual.
# Notes:
# This works only with servers that support the HTTP/1.1 protocol.
# The "Range: bytes" header is used to download the MP3 tag.
#CheckMp3 *.bin *.mp3


##########################################################################
#CheckMP3Only [Match|NoMatch] [NoCase|Case] [String|Regex] <arg> [<arg> ...]
# The meaning of the first three optional parameters is exactly the same
# as for the "Allow" command.
# If a URL matches the given rules, indexer will, as with the CheckMp3
# command, download only a small part of the document and try to find
# MP3 tags. On success, indexer will parse the MP3 tags; otherwise it
# will NOT download the whole document.
#CheckMP3Only *.bin *.mp3


# How to combine Allow, Disallow, CheckOnly, HrefOnly commands.
#
# indexer compares URLs against all these command arguments in the
# order of their appearance in the indexer.conf file.
# If indexer finds that a URL matches some rule, it decides what
# to do with this URL: allow it, disallow it, or use HEAD instead
# of the GET method. So you may arrange Allow, Disallow, CheckOnly
# and HrefOnly commands in different orders.
# If none of these commands is given, mnoGoSearch will allow everything
# by default.
#
# There are many possible combinations. Samples of two of them are here:
#
# Sample of first useful combination.
# Disallow known non-text extensions (zip, wav etc),
# then allow everything else. This sample is uncommented above (note that
# there is actually no "Allow *" command, it is added automatically after
# indexer.conf loading).
#
# Sample of second combination.
# Allow some known text extensions (html, txt) and directory index ( / ),
# then disallow everything else:
#
#Allow .html .txt */
#Disallow *


# HoldBadHrefs <time>
# Default value is 0.
# How much time to hold URLs with erroneous status before deleting them
# from the database. For example, if a host is down, indexer will not
# delete pages from this site immediately and search will use the previous
# content of these pages. However, if the site doesn't respond for a
# month, it's probably time to remove these pages from the database.
# For <time> format see description of Period command.
#HoldBadHrefs 30d


##########################################################################
# Section 3.
# Mime types and external parsers.

##########################################################################
#UseRemoteContentType yes/no
# Default value is yes.
# This command specifies if the indexer should get content type
# from HTTP server headers (yes) or from its AddType settings (no).
# If set to 'no' and the indexer could not determine content-type
# by using its AddType settings, then it will use the HTTP header.
# Default: yes
#UseRemoteContentType yes


##########################################################################
#AddType [String|Regex] [Case|NoCase] <mime type> <arg> [<arg>...]
# This command associates filename extensions (for services
# that don't automatically include them) with their mime types.
# Currently the "file:" protocol uses these commands.
# Use the optional first two parameters to choose the comparison type.
# Default type is "String" "NoCase" (case insensitive string match with
# '?' and '*' wildcards for one and several characters respectively).
#
AddType image/x-xpixmap *.xpm
AddType image/x-xbitmap *.xbm
AddType image/gif *.gif

AddType text/plain *.txt *.pl *.js *.h *.c *.pm *.e
AddType text/html *.html *.htm

AddType text/rtf *.rtf
AddType application/pdf *.pdf
AddType application/msword *.doc
AddType application/vnd.ms-excel *.xls
AddType text/x-postscript *.ps

# You may also use quotes in a mime type definition,
# for example to specify the charset. E.g. Russian webmasters
# often use the *.htm extension for windows-1251 documents and
# *.html for UNIX koi8-r documents:
#
#AddType "text/html; charset=koi8-r" *.html
#AddType "text/html; charset=windows-1251" *.htm
#
# More complicated example for rar .r00-.r99 using "Regex" match:
#AddType Regex application/rar \.r[0-9][0-9]$
#
# Default unknown type for other extensions:
AddType application/unknown *.*


# Mime <from_mime> <to_mime> <command line>
#
# This is used to add support for parsing documents with mime types other
# than text/plain and text/html. It can be done via an external parser
# (which must provide output in plain or html text) or just by
# substituting the mime type so indexer will understand it.
#
# <from_mime> and <to_mime> are standard mime types
# <to_mime> is either text/plain or text/html
#
# An optional charset parameter is used to change the charset if needed
#
# The command line may have a $1 parameter which stands for a temporary
# file name. Some parsers can not operate on stdin, so indexer creates a
# temporary file for the parser and its name is passed instead of $1.
# Take a look into the documentation for other parser types and an
# explanation of parser usage.
# Examples:
#
#    from_mime                  to_mime[charset]              [command line [$1]]
#
Mime application/msword "text/plain; charset=utf-8" "catdoc -a -dutf-8 $1"
#Mime application/x-troff-man text/plain "deroff"
Mime text/x-postscript text/plain "ps2ascii"
Mime application/pdf text/plain "pdftotext $1 -"
Mime application/vnd.ms-excel text/plain "xls2csv $1"
Mime "text/rtf*" text/html "unrtf --html $1"
Mime application/rtf text/html "unrtf --html $1"

# Use ParserTimeOut to specify the amount of time allowed for parser
# execution, to avoid a possible indexer hang.
ParserTimeOut 300


##########################################################################
# Section 4.
# Aliases configuration.

##########################################################################
#Alias <master> <mirror>
# You can use this command for example to organize search through
# a master site by indexing a mirror site. It is also useful to
# index your site from the local file system.
# mnoGoSearch will display URLs from <master> while searching
# but go to the <mirror> while indexing.
# This command has global effect for the indexer.conf file.
# You may use several aliases in one indexer.conf.
#Alias http://www.mysql.com/ http://mysql.udm.net/
#Alias http://www.site.com/ file:///usr/local/apache/htdocs/


##########################################################################
#AliasProg <command line>
# AliasProg is an external program that takes a URL and returns the
# appropriate alias to stdout. Use $1 to pass a URL. This
# command has global effect for the whole indexer.conf.
# Example:
#AliasProg "echo $1 | /usr/bin/replace http://localhost/ file:///home/httpd/"


##########################################################################
# Section 5.
# Servers configuration.

##########################################################################
#Period <time>
# Default value is 7d (i.e. 7 days).
# Set reindex period.
# <time> is in the form 'xxxA[yyyB[zzzC]]'
# (Spaces are allowed between xxx and A and yyy and so on)
#   where xxx, yyy, zzz are numbers (can be negative!)
#         A, B, C can be one of the following:
#           s - second
#           M - minute
#           h - hour
#           d - day
#           m - month
#           y - year
#   (these letters are the same as in strptime/strftime functions)
#
# Examples:
#   15s       - 15 seconds
#   4h30M     - 4 hours and 30 minutes
#   1y6m-15d  - 1 year and 6 months minus 15 days
#   1h-10M+1s - 1 hour minus 10 minutes plus 1 second
#
# If you specify only a number without any letter, it is assumed
# that the time is given in seconds.
#
# Can be set many times before "Server" command and
# takes effect till the end of config file or till next Period command.
#Period 7d


##########################################################################
#Tag <string>
# Use this field for your own purposes. For example for grouping
# some servers into one group, etc... During search you'll be able to
# limit URLs to be searched through by their tags.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next Tag command.
# Default value is an empty string.


##########################################################################
#Category <string>
#You may distribute documents between nested categories. Category
#is a string in hex number notation. You may have up to 6 levels with
#256 members per level. Empty category means the root of category tree.
#Take a look into doc/categories.xml for more information.
#This command means a category on first level:
#Category AA
#This command means a category on 5th level:
#Category FFAABBCCDD


##########################################################################
#DefaultLang <string>
#Default language for server. Can be used if you need language
#restriction while doing search.
#Default value is empty.
#DefaultLang en
DefaultLang fr


##########################################################################
#VaryLang <string>
#Default value is empty, i.e. don't vary language.
#Specify languages list for multilingual servers indexing
#VaryLang "ru en fr de"
VaryLang "fr nl en de es"


##########################################################################
#MaxHops <number>
# Maximum distance in "mouse clicks" from the start URL.
# Default value is 256.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next MaxHops command.
#MaxHops 256


##########################################################################
#MaxNetErrors <number>
# Maximum network errors for each server.
# Default value is 16. Use 0 for an unlimited number of errors.
# If there are too many network errors on some server
# (server is down, host unreachable, etc) indexer will make
# no more than 'number' attempts to connect to this server.
# Takes effect till the end of config file or till next MaxNetErrors command.
#MaxNetErrors 16


##########################################################################
#ReadTimeOut <time>
# Connect timeout and stalled connections timeout.
# For <time> format see description of Period above.
# Default value is 30 seconds.
# Can be set any number of times before "Server" command and
# takes effect till the end of config file or till next ReadTimeOut command.
#ReadTimeOut 30s


##########################################################################
#DocTimeOut <time>
# Maximum amount of time indexer spends downloading one document.
# For <time> format see description of Period above.
# Default value is 90 seconds.
# Can be set any number of times before "Server" command and
# takes effect till the end of config file or till next DocTimeOut command.
DocTimeOut 600s
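
# Illustrative note, not part of the original file: going by the <time>
# format described under Period (M = minute, s = second, a bare number is
# read as seconds), each of the commented lines below should express the
# same 10-minute limit as the active "DocTimeOut 600s". This assumes
# DocTimeOut parses <time> exactly as Period does.
#DocTimeOut 600
#DocTimeOut 10M
#DocTimeOut 9M60s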