.\" bogofilter.1in: a fast Bayesian spam filter written in C
where to write its output in passthrough mode. Note that this only works when \-p is explicitly given.
.PP
PARAMETER OPTIONS
.PP
The \fB\-E \fR\fB\fIvalue\fR\fI[,value]\fR\fR option allows setting the sp\-esf value and the ns\-esf value. With two values, both sp\-esf and ns\-esf are set. If only one value is given, parameters are set as described in the note below.
.PP
The \fB\-m \fR\fB\fIvalue\fR\fI[,value]\fR\fI[,value]\fR\fR option allows setting the min\-dev value and, optionally, the robs and robx values. With three values, min\-dev, robs, and robx are all set. If fewer values are given, parameters are set as described in the note below.
.PP
The \fB\-o \fR\fB\fIvalue\fR\fI[,value]\fR\fR option allows setting the spam\-cutoff and ham\-cutoff values. With two values, both spam\-cutoff and ham\-cutoff are set. If only one value is given, parameters are set as described in the note below.
.PP
Note: All of these options allow fewer values to be provided. Values can be skipped by using just the comma delimiter, in which case the corresponding parameter(s) won't be changed. If only the first value is provided, then only the first parameter is set. Trailing values can be skipped, in which case the corresponding parameters won't be changed. Within the parameter list, spaces are not allowed after commas.
.PP
INFO OPTIONS
.PP
The \fB\-v\fR option produces a report to standard output on bogofilter's analysis of the input. Each additional \fBv\fR will increase the verbosity of the output, up to a maximum of 4. With \fB\-vv\fR, the report lists the tokens with highest deviation from a mean of 0.5 association with spam.
.PP
Option \fB\-y date\fR can be used to override the current date when timestamping tokens. A value of zero (0) turns off timestamping.
.PP
The \fB\-D\fR option redirects debug output to stdout.
.PP
The \fB\-x \fR\fB\fIflags\fR\fR option allows setting of debug flags for printing debug information.
See header file debug.h for the list of usable flags.
.PP
CONFIG FILE OPTIONS
.PP
Using GNU longopt \fB\-\-\fR syntax, a config file's \fB\fIname=value\fR\fR statement becomes a command line's \fB\-\-\fR\fB\fIoption=value\fR\fR. Use command \fBbogofilter \-\-help\fR for a list of options and see \fIbogofilter.cf.example\fR for more info on them. For example to change the X\-Bogosity header to "X\-Spam\-Header", use:
.PP
\fB\fI\-\-spam\-header\-name=X\-Spam\-Header\fR\fR
.SH "ENVIRONMENT"
.PP
Bogofilter uses a database directory, which can be set in the config file. If not set there, bogofilter will use the value of \fBBOGOFILTER_DIR\fR. Both can be overridden by the \fB\-d \fR\fB\fIdir\fR\fR option. If none of that is available, bogofilter will use directory \fI$HOME/.bogofilter\fR.
.SH "CONFIGURATION"
.PP
The bogofilter command line allows setting of many options that determine how bogofilter operates. File \fI@sysconfdir@/bogofilter.cf\fR can be used to set additional parameters that affect its operation. File \fI@sysconfdir@/bogofilter.cf.example\fR has samples of all of the parameters. Status and logging messages can be customized for each site.
.SH "RETURN VALUES"
.PP
0 for spam; 1 for non\-spam; 2 for unsure; 3 for I/O or other errors.
.PP
If both \fB\-p\fR and \fB\-e\fR are used, the return values are: 0 for spam or non\-spam; 3 for I/O or other errors.
.PP
Error 3 usually means that the wordlist file bogofilter wants to read at startup is missing or the hard disk has filled up in \fB\-p\fR mode.
.SH "INTEGRATION WITH OTHER TOOLS"
.PP
Use with procmail
.PP
The following recipe (a) spam\-bins anything that bogofilter rates as spam, (b) registers the words in messages rated as spam as such, and (c) registers the words in messages rated as non\-spam as such.
With this in place, it will normally only be necessary for the user to intervene (with \fB\-Ns\fR or \fB\-Sn\fR) when bogofilter miscategorizes something.
.sp
.nf
# filter mail through bogofilter, tagging it as Ham, Spam, or Unsure,
# and updating the wordlist
:0fw
| bogofilter \-u \-e \-p

# if bogofilter failed, return the mail to the queue;
# the MTA will retry to deliver it later
# 75 is the value for EX_TEMPFAIL in /usr/include/sysexits.h
:0e
{ EXITCODE=75 HOST }

# file the mail to \fIspam\-bogofilter\fR if it's spam.
:0:
* ^X\-Bogosity: Spam, tests=bogofilter
spam\-bogofilter

# file the mail to \fIunsure\-bogofilter\fR
# if it's neither ham nor spam.
:0:
* ^X\-Bogosity: Unsure, tests=bogofilter
unsure\-bogofilter

# With this recipe, you can train bogofilter starting with an empty
# wordlist.  Be sure to check your unsure\-folder regularly, take the
# messages out of it, classify them as ham (or spam), and use them to
# train bogofilter.
.fi
.PP
The following procmail rule will take mail on stdin and save it to file \fIspam\fR if bogofilter thinks it's spam:
.sp
.nf
:0HB:
* ? bogofilter
spam
.fi
.sp
and this similar rule will also register the tokens in the mail according to the bogofilter classification:
.sp
.nf
:0HB:
* ? bogofilter \-u
spam
.fi
.sp
.PP
If bogofilter fails (returning 3), the message will be treated as non\-spam.
.PP
This one is for maildrop. It automatically defers the mail and retries later when the xfilter command fails. Use this in your \fI~/.mailfilter\fR:
.sp
.nf
xfilter "bogofilter \-u \-e \-p"
if (/^X\-Bogosity: Spam, tests=bogofilter/)
{
  to "spam\-bogofilter"
}
.fi
.PP
The following \fI.muttrc\fR lines will create mutt macros for dispatching mail to bogofilter.
.sp
.nf
macro index d "<enter\-command>unset wait_key\\n\\
<pipe\-entry>bogofilter \-n\\n\\
<enter\-command>set wait_key\\n\\
<delete\-message>" "delete message as non\-spam"
macro index \\ed "<enter\-command>unset wait_key\\n\\
<pipe\-entry>bogofilter \-s\\n\\
<enter\-command>set wait_key\\n\\
<delete\-message>" "delete message as spam"
.fi
.PP
Integration with Mail Transport Agent (MTA)
.TP 3
1.
bogofilter can also be integrated into an MTA to filter all incoming mail. While the specific implementation is MTA dependent, the general steps are as follows:
.TP
2.
Install bogofilter on the mail server.
.TP
3.
Prime the bogofilter databases with a spam and non\-spam corpus. Since bogofilter will be serving a larger community, it is important to prime it with a representative set of messages.
.TP
4.
Set up the MTA to invoke bogofilter on each message. While this is an MTA specific step, you'll probably need to use the \fB\-p\fR, \fB\-u\fR, and \fB\-e\fR options.
.TP
5.
Set up a mechanism for users to register spam/non\-spam messages, as well as to correct mis\-classifications. The most generic solution is to set up alias email addresses to which users bounce messages.
.TP
6.
See the \fIdoc\fR and \fIcontrib\fR directories for more information.
.PP
Use of R to verify bogofilter's calculations
.PP
The \-R option tells bogofilter to generate an R data frame. The data frame contains one row per token analyzed.
Each such row contains the token, the sum of its database "good" and "spam" counts, the "good" count divided by the number of non\-spam messages used to create the training database, the "spam" count divided by the spam message count, Robinson's f(w) for the token, the natural logs of (1 \- f(w)) and f(w), and an indicator character (+ if the token's f(w) value exceeded the minimum deviation from 0.5, \- if it didn't). There is one additional row at the end of the table that contains a label in the token field, followed by the number of words actually used (the ones with + indicators), Robinson's P, Q, S, s and x values and the minimum deviation.
.PP
The R data frame can be saved to a file and later read into an R session (see [5]\&\fIthe R project website\fR for information about the mathematics package R). Provided with the bogofilter distribution is a simple R script (file bogo.R) that can be used to verify bogofilter's calculations. Instructions for its use are included in the script in the form of comments.
.SH "LOG MESSAGES"
.PP
Bogofilter writes messages to the system log when the \fB\-l\fR option is used.
What is written depends on which other flags are used.
.PP
A classification run will generate (we are not showing the date and host part here):
.sp
.nf
bogofilter[1412]: X\-Bogosity: Ham, spamicity=0.000227
bogofilter[1415]: X\-Bogosity: Spam, spamicity=0.998918
.fi
.PP
Using \fB\-u\fR to classify a message and update a wordlist will produce (on a single line):
.sp
.nf
bogofilter[1426]: X\-Bogosity: Spam, spamicity=0.998918,
  register \-s, 329 words, 1 messages
.fi
.PP
Registering words (\fB\-l\fR and \fB\-s\fR, \fB\-n\fR, \fB\-S\fR, or \fB\-N\fR) will produce:
.sp
.nf
bogofilter[1440]: register\-n, 255 words, 1 messages
.fi
.PP
A registration run (using \fB\-s\fR, \fB\-n\fR, \fB\-N\fR, or \fB\-S\fR) will generate messages like:
.sp
.nf
bogofilter[17330]: register\-n, 574 words, 3 messages
bogofilter[6244]: register\-s, 1273 words, 4 messages
.fi
.SH "FILES"
.TP
\fI@sysconfdir@/bogofilter.cf\fR
System configuration file.
.TP
\fI~/.bogofilter.cf\fR
User configuration file.
.TP
\fI~/.bogofilter/wordlist.db\fR
Combined list of good and spam tokens.
.SH "AUTHOR"
.sp
.nf
Eric S. Raymond <esr@thyrsus.com>.
David Relson <relson@osagesoftware.com>.
Matthias Andree <matthias.andree@gmx.de>.
Greg Louis <glouis@dynamicro.on.ca>.
.fi
.PP
For updates, see the [6]\&\fIbogofilter project page\fR.
.SH "SEE ALSO"
.PP
bogolexer(1), bogotune(1), bogoupgrade(1), bogoutil(1)
.SH "REFERENCES"
.TP 3
1.\ A Plan For Spam
\%http://www.paulgraham.com/spam.html
.TP 3
2.\ Spam Detection
\%http://radio.weblogs.com/0101454/stories/2002/09/16/spamDetection.html
.TP 3
3.\ A Statistical Approach to the Spam Problem
\%http://www.linuxjournal.com/article/6467
.TP 3
4.\ Another improvement
\%http://www.garyrobinson.net/2004/04/improved_chi.html
.TP 3
5.\ the R project website
\%http://cran.r\-project.org/
.TP 3
6.\ bogofilter project page
\%http://bogofilter.sourceforge.net/
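.SH "EXAMPLES"
.PP
The exit statuses listed under RETURN VALUES can be turned into labels by a calling script. The following is a minimal Python sketch, not part of the bogofilter distribution. The names describe_status and classify are hypothetical helpers introduced here for illustration, and the sketch assumes a bogofilter binary is on the PATH when classify is called. Statuses that RETURN VALUES does not list are reported as "unknown" rather than guessed.
.sp
.nf
```python
import subprocess

# Exit statuses for a plain classification run, as listed under RETURN VALUES.
STATUS_NAMES = {0: "spam", 1: "non-spam", 2: "unsure", 3: "error"}


def describe_status(code, passthrough_e=False):
    """Translate a bogofilter exit status into a label.

    passthrough_e=True models a run with both -p and -e, where the manual
    documents only 0 (spam or non-spam) and 3 (I/O or other errors).
    """
    if passthrough_e:
        if code == 0:
            return "spam or non-spam"
        if code == 3:
            return "error"
        return "unknown"
    return STATUS_NAMES.get(code, "unknown")


def classify(message, command=("bogofilter",)):
    """Feed one message (bytes) to bogofilter on stdin and label the result.

    Hypothetical wrapper: it assumes the command reads the message from
    standard input, as the procmail rules in this manual do.
    """
    result = subprocess.run(list(command), input=message,
                            stdout=subprocess.DEVNULL)
    return describe_status(result.returncode)
```
.fi
.PP
A caller might treat "error" the way the procmail recipe above does, by deferring delivery instead of filing the message as non\-spam.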
