odeum.h

harvest is a robot that downloads HTML web pages
   `odeum' specifies a database handle.
   If successful, the return value is the number of the words stored in the database, else,
   it is -1. */
int odwnum(ODEUM *odeum);


/* Check whether a database handle is a writer or not.
   `odeum' specifies a database handle.
   The return value is true if the handle is a writer, false if not. */
int odwritable(ODEUM *odeum);


/* Check whether a database has a fatal error or not.
   `odeum' specifies a database handle.
   The return value is true if the database has a fatal error, false if not. */
int odfatalerror(ODEUM *odeum);


/* Get the inode number of a database directory.
   `odeum' specifies a database handle.
   The return value is the inode number of the database directory. */
int odinode(ODEUM *odeum);


/* Remove a database directory.
   `name' specifies the name of a database directory.
   If successful, the return value is true, else, it is false.
   A database directory can contain databases of other APIs of QDBM; they are also removed by
   this function. */
int odremove(const char *name);


/* Get a document handle.
   `uri' specifies the URI of a document.
   The return value is a document handle.
   The ID number of a new document is not defined.  It is defined when the document is stored
   in a database. */
ODDOC *oddocopen(const char *uri);


/* Close a document handle.
   `doc' specifies a document handle.
   Because the region of a closed handle is released, it becomes impossible to use the handle. */
void oddocclose(ODDOC *doc);


/* Add an attribute to a document.
   `doc' specifies a document handle.
   `name' specifies the string of the name of an attribute.
   `value' specifies the string of the value of the attribute. */
void oddocaddattr(ODDOC *doc, const char *name, const char *value);


/* Add a word to a document.
   `doc' specifies a document handle.
   `normal' specifies the string of the normalized form of a word.  Normalized forms are
   treated as keys of the inverted index.  If the normalized form of a word is an empty
   string, the word is not reflected in the inverted index.
   `asis' specifies the string of the appearance form of the word.  Appearance forms are used
   after the document is retrieved by an application. */
void oddocaddword(ODDOC *doc, const char *normal, const char *asis);


/* Get the ID number of a document.
   `doc' specifies a document handle.
   The return value is the ID number of a document. */
int oddocid(const ODDOC *doc);


/* Get the URI of a document.
   `doc' specifies a document handle.
   The return value is the string of the URI of a document. */
const char *oddocuri(const ODDOC *doc);


/* Get the value of an attribute of a document.
   `doc' specifies a document handle.
   `name' specifies the string of the name of an attribute.
   The return value is the string of the value of the attribute, or `NULL' if no attribute
   corresponds. */
const char *oddocgetattr(const ODDOC *doc, const char *name);


/* Get the list handle containing the words of a document in normalized form.
   `doc' specifies a document handle.
   The return value is the list handle containing words in normalized form. */
const CBLIST *oddocnwords(const ODDOC *doc);


/* Get the list handle containing the words of a document in appearance form.
   `doc' specifies a document handle.
   The return value is the list handle containing words in appearance form. */
const CBLIST *oddocawords(const ODDOC *doc);


/* Get the map handle containing keywords in normalized form and their scores.
   `doc' specifies a document handle.
   `max' specifies the max number of keywords to get.
   `odeum' specifies a database handle with which the IDF for weighting is calculated.
   If it is `NULL', it is not used.
   The return value is the map handle containing keywords and their scores.  Scores are
   expressed as decimal strings.
   Because the handle of the return value is opened with the function `cbmapopen', it should
   be closed with the function `cbmapclose' if it is no longer in use. */
CBMAP *oddocscores(const ODDOC *doc, int max, ODEUM *odeum);


/* Break a text into words in appearance form.
   `text' specifies the string of a text.
   The return value is the list handle containing words in appearance form.
   Words are separated with space characters and such delimiters as period, comma and so on.
   Because the handle of the return value is opened with the function `cblistopen', it should
   be closed with the function `cblistclose' if it is no longer in use. */
CBLIST *odbreaktext(const char *text);


/* Make the normalized form of a word.
   `asis' specifies the string of the appearance form of a word.
   The return value is the string of the normalized form of the word.
   Alphabets of the ASCII code are unified into lower cases.  Words composed of only delimiters
   are treated as empty strings.  Because the region of the return value is allocated with the
   `malloc' call, it should be released with the `free' call if it is no longer in use. */
char *odnormalizeword(const char *asis);


/* Get the common elements of two sets of documents.
   `apairs' specifies the pointer to the former document array.
   `anum' specifies the number of the elements of the former document array.
   `bpairs' specifies the pointer to the latter document array.
   `bnum' specifies the number of the elements of the latter document array.
   `np' specifies the pointer to a variable to which the number of the elements of the return
   value is assigned.
   The return value is the pointer to a new document array whose elements commonly belong to
   the specified two sets.
   Elements of the array are sorted in descending order of their scores.  Because the region of
   the return value is allocated with the `malloc' call, it should be released with the `free'
   call if it is no longer in use. */
ODPAIR *odpairsand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);


/* Get the sum of elements of two sets of documents.
   `apairs' specifies the pointer to the former document array.
   `anum' specifies the number of the elements of the former document array.
   `bpairs' specifies the pointer to the latter document array.
   `bnum' specifies the number of the elements of the latter document array.
   `np' specifies the pointer to a variable to which the number of the elements of the return
   value is assigned.
   The return value is the pointer to a new document array whose elements belong to both or
   either of the specified two sets.
   Elements of the array are sorted in descending order of their scores.  Because the region of
   the return value is allocated with the `malloc' call, it should be released with the `free'
   call if it is no longer in use. */
ODPAIR *odpairsor(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);


/* Get the difference set of documents.
   `apairs' specifies the pointer to the former document array.
   `anum' specifies the number of the elements of the former document array.
   `bpairs' specifies the pointer to the latter document array.
   `bnum' specifies the number of the elements of the latter document array.
   `np' specifies the pointer to a variable to which the number of the elements of the return
   value is assigned.
   The return value is the pointer to a new document array whose elements belong to the former
   set but not to the latter set.
   Elements of the array are sorted in descending order of their scores.  Because the region of
   the return value is allocated with the `malloc' call, it should be released with the `free'
   call if it is no longer in use. */
ODPAIR *odpairsnotand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);


/* Sort a set of documents in descending order of scores.
   `pairs' specifies the pointer to a document array.
   `pnum' specifies the number of the elements of the document array. */
void odpairssort(ODPAIR *pairs, int pnum);


/* Get the natural logarithm of a number.
   `x' specifies a number.
   The return value is the natural logarithm of the number.  If the number is equal to or less
   than 1.0, the return value is 0.0.
   This function is useful when an application calculates the IDF of search results. */
double odlogarithm(double x);


/* Get the cosine of the angle of two vectors.
   `avec' specifies the pointer to one array of numbers.
   `bvec' specifies the pointer to the other array of numbers.
   `vnum' specifies the number of elements of each array.
   The return value is the cosine of the angle of two vectors.
   This function is useful when an application calculates similarity of documents. */
double odvectorcosine(const int *avec, const int *bvec, int vnum);



/*************************************************************************************************
 * Functions for Experts
 *************************************************************************************************/


/* Get the positive one of square roots of a number.
   `x' specifies a number.
   The return value is the positive one of square roots of a number.  If the number is equal to
   or less than 0.0, the return value is 0.0. */
double odsquareroot(double x);


/* Get the absolute value of a vector.
   `vec' specifies the pointer to an array of numbers.
   `vnum' specifies the number of elements of the array.
   The return value is the absolute value of a vector. */
double odvecabsolute(const int *vec, int vnum);


/* Get the inner product of two vectors.
   `avec' specifies the pointer to one array of numbers.
   `bvec' specifies the pointer to the other array of numbers.
   `vnum' specifies the number of elements of each array.
   The return value is the inner product of two vectors. */
double odvecinnerproduct(const int *avec, const int *bvec, int vnum);



#endif                                   /* duplication check */


/* END OF FILE */
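The set operations documented above (`odpairsand`, `odpairsnotand`) can be hard to picture from the comments alone, so here is a minimal standalone sketch of the behavior they describe. It does not link against QDBM and is not the library's implementation: the `SKPAIR` type, the `sk_` function names, the id/score layout of a pair, and the choice to sum the scores of matching documents are all assumptions made for illustration. Only the documented contract (a `malloc`-allocated result, a count written through `np`, descending score order) is taken from the header.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for ODPAIR: a document ID and its score.
   The real definition is in a part of odeum.h not shown on this page. */
typedef struct {
  int id;
  int score;
} SKPAIR;

/* qsort comparator: descending order of score. */
static int sk_cmp_score_desc(const void *a, const void *b){
  const SKPAIR *pa = a, *pb = b;
  return (pb->score > pa->score) - (pb->score < pa->score);
}

/* Return 1 if `id' occurs in `pairs', storing its score in `*sp'; else 0. */
static int sk_find(const SKPAIR *pairs, int num, int id, int *sp){
  int i;
  for(i = 0; i < num; i++){
    if(pairs[i].id == id){
      *sp = pairs[i].score;
      return 1;
    }
  }
  return 0;
}

/* Common elements of two sets, in the spirit of odpairsand.
   Summing the two scores is an assumption of this sketch.
   The caller releases the result with free(). */
SKPAIR *sk_pairs_and(const SKPAIR *ap, int anum, const SKPAIR *bp, int bnum, int *np){
  SKPAIR *res;
  int i, n = 0, bscore;
  assert(np != NULL);
  res = malloc(sizeof(SKPAIR) * (anum > 0 ? anum : 1));
  *np = 0;
  if(!res) return NULL;
  for(i = 0; i < anum; i++){
    if(sk_find(bp, bnum, ap[i].id, &bscore)){
      res[n].id = ap[i].id;
      res[n].score = ap[i].score + bscore;
      n++;
    }
  }
  qsort(res, n, sizeof(SKPAIR), sk_cmp_score_desc);
  *np = n;
  return res;
}

/* Elements of the former set that are absent from the latter set,
   in the spirit of odpairsnotand.  The caller releases the result with free(). */
SKPAIR *sk_pairs_notand(const SKPAIR *ap, int anum, const SKPAIR *bp, int bnum, int *np){
  SKPAIR *res;
  int i, n = 0, unused;
  assert(np != NULL);
  res = malloc(sizeof(SKPAIR) * (anum > 0 ? anum : 1));
  *np = 0;
  if(!res) return NULL;
  for(i = 0; i < anum; i++){
    if(!sk_find(bp, bnum, ap[i].id, &unused)) res[n++] = ap[i];
  }
  qsort(res, n, sizeof(SKPAIR), sk_cmp_score_desc);
  *np = n;
  return res;
}
```

A caller would pass two result arrays, for example the hits for two different words, read the element count back through `np`, and `free` the returned array when finished. The nested lookup is quadratic, which keeps the sketch short; a hash map keyed by document ID would be the usual choice for large result sets.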
