prob.c
From "A fast Bayesian spam filter written in C" · C source · 54 lines
/* $Id: prob.c,v 1.11 2005/01/17 03:00:48 relson Exp $ */

/*****************************************************************************

NAME:
   prob.c -- calculate token's spamicity

AUTHORS:
   David Relson <relson@osagesoftware.com>
   Matthias Andree <matthias.andree@gmx.de>

******************************************************************************/

#include "globals.h"
#include "prob.h"

double calc_prob(uint good, uint bad, uint goodmsgs, uint badmsgs)
{
    int n = good + bad;
    double fw, pw;

    /* http://www.linuxjournal.com/article.php?sid=6467 */

    /* robs is Robinson's s parameter, the "strength of background info" */
    /* robx is Robinson's x parameter, the assumed probability that
     * a word we don't have enough info about will be spam */
    /* n is the number of messages that contain the word w */

    if (n == 0
#ifdef EXTRA_DOMAIN_CHECKING
        /* we had this in place while the ignore lists caused the
         * token to have "nan" counts because score.c left the
         * message counts at zero - #ifdef'd out for speed */
        || badmsgs == 0 || goodmsgs == 0
#endif
        )
    {
        /* in these cases, pw would be undefined and return NaN
         * we substitute "we don't know", the x parameter */
        fw = robx;
    }
    else
    {
        /* The original version of this code has four divisions.
        pw = ((bad / badmsgs) / (bad / badmsgs + good / goodmsgs));
        */
        /* This modified version, with 1 division, is considerably faster. */
        pw = bad * (double)goodmsgs / (bad * (double)goodmsgs + good * (double)badmsgs);
        fw = (robs * robx + n * pw) / (robs + n);
    }

    return fw;
}
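
The comments in calc_prob say the one-division expression replaces a four-division one. Multiplying the numerator and denominator of the ratio form by badmsgs * goodmsgs gives the single-division form, so the two should agree up to rounding. The sketch below is a standalone illustration of that, not part of the original file. The names pw_ratio_form, pw_one_div, demo_calc_prob, DEMO_ROBS and DEMO_ROBX are hypothetical, and the values 1.0 and 0.5 are placeholders chosen for easy arithmetic; the real robs and robx are defined elsewhere in the project.

```c
#include <assert.h>

/* Placeholder stand-ins for the project's robs and robx (assumed values). */
static const double DEMO_ROBS = 1.0;   /* Robinson's s: strength of background info */
static const double DEMO_ROBX = 0.5;   /* Robinson's x: assumed spamicity of an unknown word */

/* Ratio form from the source comment, with the divisions done in floating point.
 * It needs non-zero message counts, otherwise it divides by zero. */
double pw_ratio_form(unsigned int good, unsigned int bad,
                     unsigned int goodmsgs, unsigned int badmsgs)
{
    assert(goodmsgs > 0 && badmsgs > 0);
    double bad_ratio = (double)bad / badmsgs;
    double good_ratio = (double)good / goodmsgs;
    return bad_ratio / (bad_ratio + good_ratio);
}

/* Single-division form, as used in calc_prob. */
double pw_one_div(unsigned int good, unsigned int bad,
                  unsigned int goodmsgs, unsigned int badmsgs)
{
    return bad * (double)goodmsgs /
        (bad * (double)goodmsgs + good * (double)badmsgs);
}

/* Same control flow as calc_prob without the optional domain check,
 * using the placeholder parameters above. */
double demo_calc_prob(unsigned int good, unsigned int bad,
                      unsigned int goodmsgs, unsigned int badmsgs)
{
    unsigned int n = good + bad;

    if (n == 0)
        return DEMO_ROBX;   /* word never seen: fall back to x */

    double pw = pw_one_div(good, bad, goodmsgs, badmsgs);
    return (DEMO_ROBS * DEMO_ROBX + n * pw) / (DEMO_ROBS + n);
}
```

For a word seen in 2 of 100 good messages and 8 of 100 bad ones, both forms give pw = 0.8, and with the placeholder parameters demo_calc_prob(2, 8, 100, 100) evaluates (1.0 * 0.5 + 10 * 0.8) / (1.0 + 10) = 8.5 / 11, roughly 0.773. A word with no occurrences at all returns the placeholder x, 0.5.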