BERKELEY DB ENVIRONMENT CODE
============================

$Id: README.db,v 1.37 2006/01/13 01:03:36 m-a Exp $

This document does not apply when you are installing a bogofilter
version that has been configured to use the QDBM or SQLite3
database libraries.

Common entry points are these

QUICK LINKS:
------------
Upgrade of Berkeley DB or operating system    section 2.6
Switching transactions on or off              section 2.2.1 or 2.2.2
Recover broken database                       section 3.2 or 3.3

0. Definitions ---------------------------------------------------------

Whenever ~/.bogofilter appears in the text below, this is the directory
where bogofilter keeps its data base by default. If you are overriding
this directory by configuration or environment variables, substitute
your actual bogofilter data base directory.

1. Overview ------------------------------------------------------------

Operating bogofilter with a Berkeley DB back-end requires some
attention. Bogofilter offers two operating modes that can be switched on
or off on a directory-by-directory basis:

 - the traditional, simple Berkeley DB Data Store

   easy to use, but also easy to corrupt by an interrupted registration
   of spam or non-spam mail, an application crash, a system crash, a
   disk that runs out of space, or a user that runs out of quota.

 - the Berkeley DB Transactional Data Store, "transactional", "TXN" or
   "XA" in bogofilter jargon for short

   a bit more complex to use, but pretty crash-proof as long as it is
   not abused. Upgrading bogofilter or Berkeley DB, copying databases
   around, and making backups require special considerations given in
   this document.

Note that the system administrator can disable either mode of operation
at compile time; in the default install, both will be present.

2. Prerequisites and Caveats -------------------------------------------

2.1 Compatibility, Berkeley DB versions

These versions are supported (with all Sleepycat-posted patches applied
- if using a pre-packaged Berkeley DB version, the packager should have
applied the patches, check your vendor's update site regularly):

    Sleepycat Software: Berkeley DB 3.1.17: (July 31, 2000)
    Sleepycat Software: Berkeley DB 3.2.9: (January 24, 2001)
    Sleepycat Software: Berkeley DB 3.3.11: (July 12, 2001)
    Sleepycat Software: Berkeley DB 4.0.14: (November 18, 2001)
    Sleepycat Software: Berkeley DB 4.1.25: (December 19, 2002)
    Sleepycat Software: Berkeley DB 4.2.52: (December 3, 2003)
    Sleepycat Software: Berkeley DB 4.3.29: (September 6, 2005)
    Sleepycat Software: Berkeley DB 4.4.20: (January 10, 2006)

Other versions of Berkeley DB between the first and last listed above
may or may not work, but usually they will.

Berkeley DB versions 4.1 and newer are recommended over the previous
versions, because the newer versions can detect data corruption more
reliably (through the use of checksums that detect partially written
data base pages). Berkeley DB 4.2 and 4.3 appear a bit faster under load
than 4.1 and older versions; 4.4 has not yet been evaluated.

2.2.1 Upgrading to transactional databases, also from older bogofilter versions

NOTE: for updates of Berkeley DB itself, see section 2.6!

Bogofilter should transparently upgrade the existing data base to the
new transactional data base, but going the reverse way requires manual
intervention, see section 2.2.2 below.

For enhanced reliability (only available with Berkeley DB 4.1 or newer),
it is recommended that you dump and reload the database once so that
Berkeley DB adds page checksums. Skip this procedure for Berkeley DB
versions 4.0 or older.
For 4.1 and newer, use these commands:

    cd ~/.bogofilter
    bogoutil -d wordlist.db > wordlist.txt
    mv wordlist.db wordlist.db.old
    bogoutil --db-transaction=yes -l wordlist.db < wordlist.txt

And if all commands succeeded:

    rm wordlist.txt wordlist.db.old

2.2.2 Downgrading to non-transactional database

To downgrade a transactional database to a non-transactional one, a
dump/reload cycle is required. These commands should do the job:

    cd ~/.bogofilter
    bogoutil -d wordlist.db > wordlist.txt
    mv wordlist.db wordlist.db.old
    rm -f log.?????????? __db.???
    bogoutil --db-transaction=no -l wordlist.db < wordlist.txt

2.3 Recoverability

The ability to recover the data base after a crash (power failure!)
depends on data being written to the disk (or a battery-backed write
cache) _immediately_ rather than delayed to be written later.

Common disk drives in current PCs and Macs are of the ATA or SATA kind
and usually ship with their write cache enabled. They write fast, but
can lose or corrupt up to a few MB of data when the power fails.

Note: This problem is not specific to bogofilter.

It is possible to sacrifice a bit of the write speed and gain
reliability in return by switching off the disk's write cache (see
appendix A for instructions).

Switching the write cache off may, however, degrade performance below
acceptable levels, particularly for large writes such as recording live
audio or video data to hard disk.

If performance is degraded too much, consider getting a separate disk
drive and using one for fast writes (with the write cache on) and one
for reliable writes (with the write cache off, for bogofilter, mail
servers and other applications that need to survive a power loss
without data loss).

2.4 Choosing a file system

If your computer saves the data on its own disk drive (a "local file
system"), Berkeley DB should work fine.
Such file systems include ext2, ext3, ffs, jfs, hfs, hfs+, reiserfs,
ufs, xfs.

Berkeley DB Transactional and Concurrent data stores do not work
reliably with a networked file system for various technical reasons.
AFS, CIFS, Coda, NFS, SMBFS fall into this category.

Strictly speaking, with Berkeley DB 4.0 and older versions, the data
base block size must be written atomically. The bogofilter maintainers
are not currently aware of a file system that meets this requirement and
is production quality at the same time.

2.5 Making a snapshot backup

The transactional data store is no good if the disk drive has become
inaccessible (which happens after some months or years with every
drive), so you _must_ back up your data base regularly (see the
db_archive utility manual for additional documentation of a "hot"
backup). Bogofilter cannot, of course, guess data that got lost through
a hard drive fault.

When copying or archiving directory contents, be sure to copy or archive
the *.db files FIRST, BEFORE archiving/copying the log.* files. This
order is needed to let the database copy or archive remain recoverable.

You can use the bf_tar script for convenience.
It requires the pax utility (UNIX standard for over a decade now) and
writes a tar archive to stdout, and it can optionally remove inactive
log files before or after writing the tarball.

Run bf_tar without arguments to see its synopsis.

2.6 Updating Berkeley DB version underneath bogofilter

When upgrading the Berkeley DB library to a new version, or recompiling
bogofilter to use a newer version, two things in the on-disk data format
can change, generally speaking: the database format (we use the Btree
access method), the log file format, or both.

You need a "log file upgrade" for your transactional databases if at
least one of these conditions is true ("-->" means "to"):

- you upgraded Berkeley DB from a 3.X version --> a 4.Y version
- you upgraded Berkeley DB from 4.0 or 4.1 --> 4.2, 4.3 or 4.4
- you upgraded Berkeley DB from 4.2 --> 4.3 or 4.4
- you upgraded Berkeley DB from 4.3 --> 4.4

Non-transactional databases do not need log file format upgrades as they
do not use log files.

If you need a log file upgrade, the upgrade procedure is:
(NOTE: DO NOT UPGRADE BERKELEY DB OR BOGOFILTER UNTIL STEP 4!)

 1. shut down your mail system,
 2. run 'bogoutil --db-remove-environment ~/.bogofilter' (for each user!)
    (this will automatically recover the environment)
 3. archive the database for catastrophic recovery (make a backup)
 4. install the new Berkeley DB version, recompile bogofilter (unless
    using a binary package), install the new bogofilter
 5. start your mail system.

If you've been using Berkeley DB 3.0 (only supported with bogofilter
versions 0.17.2 and older) and are about to update to any newer 3.X or
4.X version, you need a database format upgrade. You can either:

- dump the wordlists to a text file, then update Berkeley DB and
  bogofilter, remove the wordlists and load them from the dump.
  The procedure is the same as in section 2.2.1 or 2.2.2 (depending on
  whether you want to use transactions or not).

or

- use the db_upgrade utility to upgrade the databases in place
  (this is dangerous and must not be interrupted, back up first!)

3. Use and troubleshooting ---------------------------------------------

3.1 LOG FILE HANDLING

This section only applies to transactional databases.

The Berkeley DB Transactional Data Store uses log files to store
database changes so that they can be rolled back or restored after an
application crash.

The logs of the transactional data store, log.NNNNNNNNNN files of up to
1 MB in size (in the default configuration), can consume considerable
amounts of disk space. Bogofilter therefore removes log files that are
no longer part of active transactions by default.

This automatic removal of log files can be disabled by either using
--db-log-autoremove=no on the command line, or by configuring
db_log_autoremove=no in the configuration file. If automatic removal is
disabled, it can be triggered manually by running
'bogoutil --db-prune ~/.bogofilter'.

Note that removing log files makes catastrophic recovery impossible
without backups, so you *must* make snapshot backups that contain both
the *.db and log.* files (in this order!). See section 2.5 for details.

Referral: Berkeley DB's db_archive documentation contains suggestions
for several backup strategies.

3.2 RECOVERY AND FAILED RECOVERY, FOR TRANSACTIONAL DATABASES ONLY

The recovery procedures should be tried in the order shown in this
section. If you aren't willing to experiment much, but have kept
sufficient spam and ham that you can easily and quickly retrain
bogofilter from scratch, read only sections 3.2.1 and 3.2.5, skipping
subsections 3.2.2 to 3.2.4.

3.2.1 Regular recovery

Bogofilter and related bogo* utilities will automatically detect when
regular recovery is needed and run it.
This process is transparent, the user will usually not be aware that it
happens.

This process needs the *.db file and the corresponding _active_ log
files.

It is possible to trigger regular recovery by running

    bogoutil --db-recover ~/.bogofilter

although this should not be needed.

If this fails, remove the __db.*, *.db and log.* files, restore from the
latest snapshot backup (see section 2.5) and force recovery as shown in
the previous paragraph.

3.2.2 Catastrophic recovery

If regular recovery fails after severe damage to hardware, file system,
or database files, you can attempt to run catastrophic recovery. If log
files have been damaged, catastrophic recovery may not work.

This may need *all* log files from the backup and is therefore not
available if log files have been pruned.

To run catastrophic recovery, restore the log files from your archives,
then run:

    bogoutil --db-recover-harder ~/.bogofilter

If this fails, read on.

3.2.3 Last-resort recovery method #1: nuke the environment

This recovery method does not guarantee you are getting all of your
database. You may already have lost part or all of your data when this
is required, and you may lose recent updates to the database and corrupt
it. Only attempt this method if the regular and catastrophic recovery
methods have failed or were unavailable.

To use this method:

 1. remove the __db.* and log.* files
 2. run: bogoutil -v --db-verify ~/.bogofilter/wordlist.db
 3a. if that printed OK, watch carefully if bogofilter performs to the
     standards you are used to.
 3b. if verify failed, read the next section.

3.2.4 Last-resort recovery method #2: salvage the raw .db file

This recovery method does not guarantee you are getting all of your
database. You may already have lost part or all of your data when this
is required, and you may lose recent updates to the database and corrupt
it.
Only attempt this method if the regular and catastrophic recovery
methods have failed or were unavailable.

To try this method:

    # salvage raw data
    db_dump -r ~/.bogofilter/wordlist.db > ~/.bogofilter/wordlist.saved
    rm ~/.bogofilter/{__db.*,log.*,wordlist.db}
    db_load ~/.bogofilter/wordlist.db < ~/.bogofilter/wordlist.saved

    # convert to transactional store
    bogoutil -d ~/.bogofilter/wordlist.db > ~/.bogofilter/wordlist.saved
    rm ~/.bogofilter/{__db.*,log.*,wordlist.db}
    bogoutil --db-transaction=yes -l ~/.bogofilter/wordlist.db \
        < ~/.bogofilter/wordlist.saved

3.2.5 No recovery possible?

We're sorry. This should happen very rarely. There's nothing left to
try, so you need to retrain from scratch.

First, remove the database and environment, type:

    rm ~/.bogofilter/{__db.*,log.*,wordlist.db}

Then retrain from scratch with the usual bogofilter -n and bogofilter -s
commands.

3.3 RECOVERY OF TRADITIONAL (NON-TRANSACTIONAL) DATABASES