📄 options.py

📁 Mail filter implemented in Python
     _("""Generate both unigrams (words) and bigrams (pairs of
     words). However, extending an idea originally from Gary Robinson, the
     message is 'tiled' into non-overlapping unigrams and bigrams,
     approximating the strongest outcome over all possible tilings.
     Note that to really test this option you need to retrain with it on,
     so that your database includes the bigrams - if you subsequently turn
     it off, these tokens will have no effect.  This option will at least
     double your database size given the same training data, and will
     probably at least triple it.
     You may also wish to increase the max_discriminators (maximum number
     of extreme words) option if you enable this option, perhaps doubling or
     quadrupling it.  It's not yet clear.  Bigrams create many more hapaxes,
     and that seems to increase the brittleness of minimalist training
     regimes; increasing max_discriminators may help to soften that effect.
     OTOH, max_discriminators defaults to 150 in part because that makes it
     easy to prove that the chi-squared math is immune from numeric
     problems.  Increase it too much, and insane results will eventually
     result (including fatal floating-point exceptions on some boxes).
     This option is experimental, and may be removed in a future release.
     We would appreciate feedback about it if you use it - email
     spambayes@python.org with your comments and results.
     """),
     BOOLEAN, RESTORE),
  ),

  "Hammie": (
    ("train_on_filter", _("Train when filtering"), False,
     _("""Train when filtering?  After filtering a message, hammie can then
     train itself on the judgement (ham or spam).  This can speed things up
     with a procmail-based solution.  If you do enable this, please make
     sure to retrain any mistakes.  Otherwise, your word database will
     slowly become useless.
     Note that this option is only used by
     sb_filter, and will have no effect on sb_server's POP3 proxy, or
     the IMAP filter."""),
     BOOLEAN, RESTORE),
  ),

  # These options control where Spambayes data will be stored, and in
  # what form.  They are used by many Spambayes applications (including
  # pop3proxy, smtpproxy, imapfilter and hammie), and mean that data
  # (such as the message database) is shared between the applications.
  # If this is not the desired behaviour, you must have a different
  # value for each of these options in a configuration file that gets
  # loaded by the appropriate application only.
  "Storage" : (
    ("persistent_use_database", _("Database backend"), DB_TYPE[0],
     _("""SpamBayes can use either a ZODB or dbm database (quick to score
     one message) or a pickle (quick to train on huge amounts of messages).
     There is also (currently experimental) the ability to use a MySQL or
     PostgreSQL database."""),
     ("zeo", "zodb", "cdb", "mysql", "pgsql", "dbm", "pickle"), RESTORE),

    ("persistent_storage_file", _("Storage file name"), DB_TYPE[1],
     _("""Spambayes builds a database of information that it gathers
     from incoming emails and from you, the user, to get better and
     better at classifying your email.  This option specifies the
     name of the database file.  If you don't give a full pathname,
     the name will be taken to be relative to the location of the
     most recent configuration file loaded."""),
     FILE_WITH_PATH, DO_NOT_RESTORE),

    ("messageinfo_storage_file", _("Message information file name"), DB_TYPE[2],
     _("""Spambayes builds a database of information about messages
     that it has already seen and trained or classified.  This
     database is used to ensure that these messages are not retrained
     or reclassified (unless specifically requested to).  This option
     specifies the name of the database file.
     If you don't give a
     full pathname, the name will be taken to be relative to the location
     of the most recent configuration file loaded."""),
     FILE_WITH_PATH, DO_NOT_RESTORE),

    ("cache_use_gzip", _("Use gzip"), False,
     _("""Use gzip to compress the cache."""),
     BOOLEAN, RESTORE),

    ("cache_expiry_days", _("Days before cached messages expire"), 7,
     _("""Messages will be expired from the cache after this many days.
     After this time, you will no longer be able to train on these messages
     (note this does not affect the copy of the message that you have in
     your mail client)."""),
     INTEGER, RESTORE),

    ("spam_cache", _("Spam cache directory"), "pop3proxy-spam-cache",
     _("""Directory that SpamBayes should cache spam in.  If this does
     not exist, it will be created."""),
     PATH, DO_NOT_RESTORE),

    ("ham_cache", _("Ham cache directory"), "pop3proxy-ham-cache",
     _("""Directory that SpamBayes should cache ham in.  If this does
     not exist, it will be created."""),
     PATH, DO_NOT_RESTORE),

    ("unknown_cache", _("Unknown cache directory"), "pop3proxy-unknown-cache",
     _("""Directory that SpamBayes should cache unclassified messages in.
     If this does not exist, it will be created."""),
     PATH, DO_NOT_RESTORE),

    ("cache_messages", _("Cache messages"), True,
     _("""You can disable the pop3proxy caching of messages.  This
     will make the proxy a bit faster, and make it use less space
     on your hard drive.  The proxy uses its cache for reviewing
     and training of messages, so if you disable caching you won't
     be able to do further training unless you re-enable it.
     Thus, you should only turn caching off when you are satisfied
     with the filtering that Spambayes is doing for you."""),
     BOOLEAN, RESTORE),

    ("no_cache_bulk_ham", _("Suppress caching of bulk ham"), False,
     _("""Where message caching is enabled, this option suppresses caching
     of messages which are classified as ham and marked as
     'Precedence: bulk' or 'Precedence: list'.  If you subscribe to a
     high-volume mailing list then your 'Review messages' page can be
     overwhelmed with list messages, making training a pain.  Once you've
     trained Spambayes on enough list traffic, you can use this option
     to prevent that traffic showing up in 'Review messages'."""),
     BOOLEAN, RESTORE),

    ("no_cache_large_messages", _("Maximum size of cached messages"), 0,
     _("""Where message caching is enabled, this option suppresses caching
     of messages which are larger than this value (measured in bytes).
     If you receive a lot of messages that include large attachments
     (and are correctly classified), you may not wish to cache these.
     If you set this to zero (0), then this option will have no effect."""),
     INTEGER, RESTORE),
  ),

  # These options control the various headers that some Spambayes
  # applications add to incoming mail, including imapfilter, pop3proxy,
  # and hammie.
  "Headers" : (
    # The name of the header that hammie, pop3proxy, and any other spambayes
    # software, adds to emails in filter mode.  This will definitely contain
    # the "classification" of the mail, and may also (i.e. with hammie)
    # contain the score
    ("classification_header_name", _("Classification header name"), "X-Spambayes-Classification",
     _("""Spambayes classifies each message by inserting a new header into
     the message.  This header can then be used by your email client
     (provided your client supports filtering) to move spam into a
     separate folder (recommended), delete it (not recommended), etc.
     This option specifies the name of the header that Spambayes inserts.
     The default value should work just fine, but you may change it to
     anything that you wish."""),
     HEADER_NAME, RESTORE),

    # The three disposition names are added to the header as the following
    # three words:
    ("header_spam_string", _("Spam disposition name"), _("spam"),
     _("""The header that Spambayes inserts into each email has a name
     (Classification header name, above), and a value.  If the classifier
     determines that this email is probably spam, it places a header named
     as above with a value as specified by this string.  The default
     value should work just fine, but you may change it to anything
     that you wish."""),
     HEADER_VALUE, RESTORE),

    ("header_ham_string", _("Ham disposition name"), _("ham"),
     _("""As for Spam Designation, but for emails classified as Ham."""),
     HEADER_VALUE, RESTORE),

    ("header_unsure_string", _("Unsure disposition name"), _("unsure"),
     _("""As for Spam/Ham Designation, but for emails which the
     classifier wasn't sure about (i.e. the spam probability fell between
     the Ham and Spam Cutoffs).  Emails that have this classification
     should always be the subject of training."""),
     HEADER_VALUE, RESTORE),

    ("header_score_digits", _("Accuracy of reported score"), 2,
     _("""Accuracy of the score in the header in decimal digits."""),
     INTEGER, RESTORE),

    ("header_score_logarithm", _("Augment score with logarithm"), False,
     _("""Set this option to augment scores of 1.00 or 0.00 by a
     logarithmic "one-ness" or "zero-ness" score (basically it shows the
     "number of zeros" or "number of nines" next to the score value)."""),
     BOOLEAN, RESTORE),

    ("include_score", _("Add probability (score) header"), False,
     _("""You can have Spambayes insert a header with the calculated spam
     probability into each mail.
     If you can view headers with your
     mailer, then you can see this information, which can be interesting
     and even instructive if you're a serious SpamBayes junkie."""),
     BOOLEAN, RESTORE),

    ("score_header_name", _("Probability (score) header name"), "X-Spambayes-Spam-Probability",
     _(""""""),
     HEADER_NAME, RESTORE),

    ("include_thermostat", _("Add level header"), False,
     _("""You can have spambayes insert a header with the calculated spam
     probability, expressed as a number of '*'s, into each mail (the more
     '*'s, the higher the probability it is spam). If your mailer
     supports it, you can use this information to fine tune your
     classification of ham/spam, ignoring the classification given."""),
     BOOLEAN, RESTORE),

    ("thermostat_header_name", _("Level header name"), "X-Spambayes-Level",
     _(""""""),
     HEADER_NAME, RESTORE),

    ("include_evidence", _("Add evidence header"), False,
     _("""You can have spambayes insert a header into mail, with the
     evidence that it used to classify that message (a collection of
     words with ham and spam probabilities).  If you can view headers
     with your mailer, then this may give you some insight as to why
     a particular message was scored in a particular way."""),
     BOOLEAN, RESTORE),

    ("evidence_header_name", _("Evidence header name"), "X-Spambayes-Evidence",
     _(""""""),
     HEADER_NAME, RESTORE),

    ("mailid_header_name", _("Spambayes id header name"), "X-Spambayes-MailId",
     _(""""""),
     HEADER_NAME, RESTORE),

    ("include_trained", _("Add trained header"), True,
     _("""sb_mboxtrain.py and sb_filter.py can add a header that details
     how a message was trained, which lets you keep track of it, and
     appropriately re-train messages.
     However, if you would rather
     mboxtrain/sb_filter didn't rewrite the message files, you can disable
     this option."""),
     BOOLEAN, RESTORE),

    ("trained_header_name", _("Trained header name"), "X-Spambayes-Trained",
     _("""When training on a message, the name of the header to add with how
     it was trained"""),
     HEADER_NAME, RESTORE),

    ("clue_mailheader_cutoff", _("Debug header cutoff"), 0.5,
     _("""The range of clues that are added to the "debug" header in the
     E-mail. All clues that have their probability smaller than this number,
     or larger than one minus this number are added to the header such that
     you can see why spambayes thinks this is ham/spam or why it is unsure.
     The default is to show all clues, but you can reduce that by setting
     this option to a lower value, such as 0.1"""),
     REAL, RESTORE),

    ("add_unique_id", _("Add unique spambayes id"), True,
     _("""If you wish to be able to find a specific message (via the 'find'
     box on the home page), or use the SMTP proxy to train using cached
     messages, you will need to know the unique id of each message.  This
     option adds this information to a header added to each message."""),
     BOOLEAN, RESTORE),

    ("notate_to", _("Notate to"), (),
     _("""Some email clients (Outlook Express, for example) can only set up
     filtering rules on a limited set of headers.  These clients cannot
     test for the existence/value of an arbitrary header and filter mail
     based on that information.  To accommodate these kinds of mail clients,
     you can add "spam", "ham", or "unsure" to the recipient list.  A
     filter rule can then use this to see if one of these words (followed
     by a comma) is in the recipient list, and route the mail to an
     appropriate folder, or take whatever other action is supported and
     appropriate for the mail classification.
     As it interferes with replying, you may only wish to do this for
     spam messages; simply tick the boxes of the classifications that
     should be identified in this fashion."""),
     ((), _("ham"), _("spam"), _("unsure")), RESTORE),

    ("notate_subject", _("Classify in subject: header"), (),
     _("""This option will add the same information as 'Notate To',
     but to the start of the mail subject line."""),
     ((), _("ham"), _("spam"), _("unsure")), RESTORE),
  ),

  # pop3proxy settings: The only mandatory option is pop3proxy_servers, e.g.
  # "pop3.my-isp.com:110", or a comma-separated list of those.  The ":110"
  # is optional.  If you specify more than one server in pop3proxy_servers,
  # you must specify the same number of ports in pop3proxy_ports.
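
Illustration only, not code from options.py: the help text for clue_mailheader_cutoff in the "Headers" section describes a simple threshold rule, where a clue is reported if its probability is below the cutoff or above one minus the cutoff. The sketch below applies that rule to a list of (token, probability) pairs. The function name select_clues, the sample clue list, and the choice of strict inequalities are assumptions made for this example; they are not taken from the SpamBayes sources.

```python
def select_clues(clues, cutoff=0.5):
    """Return the (token, prob) pairs extreme enough to report.

    A clue is kept when prob < cutoff or prob > 1 - cutoff, following
    the wording of the clue_mailheader_cutoff help text.  Strict
    inequalities are an assumption of this sketch.
    """
    return [(token, prob) for token, prob in clues
            if prob < cutoff or prob > 1.0 - cutoff]


# Hypothetical clue list: (token, spam probability) pairs.
clues = [("viagra", 0.99), ("meeting", 0.03), ("hello", 0.45), ("the", 0.5)]

# Default cutoff of 0.5 keeps every clue that is not exactly neutral.
print(select_clues(clues))

# A lower cutoff keeps only the strongly hammy or spammy clues.
print(select_clues(clues, 0.1))
```

With the hypothetical data above, lowering the cutoff to 0.1 drops "hello" and "the" and leaves only the two extreme tokens, which is the narrowing effect the help text describes.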