libbow-desc.texi


@samp{Libbow} is a library of C code intended for writing statistical
text-processing programs.  This distribution includes the library, as
well as a text classification front-end, and a document retrieval
front-end.

@format
The library provides facilities for:
        Recursively descending directories, finding text files.
        Finding `document' boundaries when there are multiple docs per file.
        Tokenizing a text file, according to several different methods.
        Including N-grams among the tokens.
        Mapping strings to integers and back again, very efficiently.
        Building a sparse matrix of document/token counts.
        Pruning vocabulary by occurrence counts or by information gain.
        Building and manipulating word vectors.
        Setting word vector weights according to NaiveBayes, TFIDF, and a
          simple form of Probabilistic Indexing.
        Scoring queries for retrieval or classification.
        Writing all data structures to disk in a machine-architecture-
          independent format.
        Reading the document/token matrix from disk in an efficient,
          sparse fashion.
        Performing test/train splits, and automatic classification tests.
@end format

It should compile on most UNIX systems, and WindowsNT (with a GNU build
environment).

The code conforms to the GNU coding standards.  It is released under the
Library GNU Public License.

@format
The library does not:
        Have parsing facilities.
        Do smoothing across N-gram models.
        Claim to be finished.
        Have good documentation.
        Claim to be bug-free.
        ...many other things.
@end format
