
📄 odeum.h

📁 harvest is a robot that downloads HTML web pages
📖 Page 1 of 2
/*************************************************************************************************
 * The inverted API of QDBM
 *                                                      Copyright (C) 2000-2003 Mikio Hirabayashi
 * This file is part of QDBM, Quick Database Manager.
 * QDBM is free software; you can redistribute it and/or modify it under the terms of the GNU
 * Lesser General Public License as published by the Free Software Foundation; either version
 * 2.1 of the License or any later version.  QDBM is distributed in the hope that it will be
 * useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
 * details.
 * You should have received a copy of the GNU Lesser General Public License along with QDBM; if
 * not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
 * 02111-1307 USA.
 *************************************************************************************************/


#ifndef _ODEUM_H                         /* duplication check */
#define _ODEUM_H


#include <depot.h>
#include <curia.h>
#include <cabin.h>
#include <villa.h>
#include <stdlib.h>



/*************************************************************************************************
 * API
 *************************************************************************************************/


typedef struct {                         /* type of structure for a database handle */
  char *name;                            /* name of the database directory */
  int wmode;                             /* whether writable or not */
  int fatal;                             /* whether a fatal error occurred or not */
  int inode;                             /* inode of the database directory */
  CURIA *docsdb;                         /* database handle for documents */
  CURIA *indexdb;                        /* database handle for the inverted index */
  VILLA *rdocsdb;                        /* database handle for reverse dictionary */
  CBMAP *sortmap;                        /* map handle for candidates of sorting */
  int dmax;                              /* max number of the document ID */
  int dnum;                              /* number of the documents */
} ODEUM;

typedef struct {                         /* type of structure for a document handle */
  int id;                                /* ID number */
  char *uri;                             /* uniform resource identifier */
  CBMAP *attrs;                          /* map handle for attributes */
  CBLIST *nwords;                        /* list handle for words in normalized form */
  CBLIST *awords;                        /* list handle for words in appearance form */
} ODDOC;

typedef struct {                         /* type of structure for an element of search result */
  int id;                                /* ID number of the document */
  int score;                             /* score of the document */
} ODPAIR;

enum {                                   /* enumeration for open modes */
  OD_OREADER = 1 << 0,                   /* open as a reader */
  OD_OWRITER = 1 << 1,                   /* open as a writer */
  OD_OCREAT = 1 << 2,                    /* a writer creating */
  OD_OTRUNC = 1 << 3,                    /* a writer truncating */
  OD_ONOLCK = 1 << 4                     /* open without locking */
};


/* Get a database handle.
   `name' specifies the name of a database directory.
   `omode' specifies the connection mode: `OD_OWRITER' as a writer, `OD_OREADER' as a reader.
   If the mode is `OD_OWRITER', the following may be added by bitwise or: `OD_OCREAT', which
   means it creates a new database if one does not exist, `OD_OTRUNC', which means it creates
   a new database regardless of whether one exists.  To both `OD_OREADER' and `OD_OWRITER',
   `OD_ONOLCK' can be added by bitwise or, which means it opens a database directory without
   file locking.
   The return value is the database handle or `NULL' if it is not successful.
   While connecting as a writer, an exclusive lock is invoked to the database directory.
   While connecting as a reader, a shared lock is invoked to the database directory.
   The thread blocks until the lock is achieved.  If `OD_ONOLCK' is used, the application is
   responsible for exclusion control. */
ODEUM *odopen(const char *name, int omode);


/* Close a database handle.
   `odeum' specifies a database handle.
   If successful, the return value is true, else, it is false.
   Because the region of a closed handle is released, it becomes impossible to use the handle.
   Updating a database is assured to be written when the handle is closed.  If a writer opens
   a database but does not close it appropriately, the database will be broken. */
int odclose(ODEUM *odeum);


/* Store a document.
   `odeum' specifies a database handle connected as a writer.
   `doc' specifies a document handle.
   `wmax' specifies the max number of words to be stored in the document database.  If it is
   negative, the number is unlimited.
   `over' specifies whether the data of the duplicated document is overwritten or not.  If it
   is false and the URI of the document is duplicated, the function returns as an error.
   If successful, the return value is true, else, it is false. */
int odput(ODEUM *odeum, ODDOC *doc, int wmax, int over);


/* Delete a document specified by a URI.
   `odeum' specifies a database handle connected as a writer.
   `uri' specifies the string of the URI of a document.
   If successful, the return value is true, else, it is false.  False is returned when no
   document corresponds to the specified URI. */
int odout(ODEUM *odeum, const char *uri);


/* Delete a document specified by an ID number.
   `odeum' specifies a database handle connected as a writer.
   `id' specifies the ID number of a document.
   If successful, the return value is true, else, it is false.  False is returned when no
   document corresponds to the specified ID number. */
int odoutbyid(ODEUM *odeum, int id);


/* Retrieve a document specified by a URI.
   `odeum' specifies a database handle.
   `uri' specifies the string of the URI of a document.
   If successful, the return value is the handle of the corresponding document, else, it is
   `NULL'.  `NULL' is returned when no document corresponds to the specified URI.
   Because the handle of the return value is opened with the function `oddocopen', it should
   be closed with the function `oddocclose'. */
ODDOC *odget(ODEUM *odeum, const char *uri);


/* Retrieve a document by an ID number.
   `odeum' specifies a database handle.
   `id' specifies the ID number of a document.
   If successful, the return value is the handle of the corresponding document, else, it is
   `NULL'.  `NULL' is returned when no document corresponds to the specified ID number.
   Because the handle of the return value is opened with the function `oddocopen', it should
   be closed with the function `oddocclose'. */
ODDOC *odgetbyid(ODEUM *odeum, int id);


/* Search the inverted index for documents including a particular word.
   `odeum' specifies a database handle.
   `word' specifies a searching word.
   `max' specifies the max number of documents to be retrieved.
   `np' specifies the pointer to a variable to which the number of the elements of the return
   value is assigned.
   If successful, the return value is the pointer to an array, else, it is `NULL'.  Each
   element of the array is a pair of the ID number and the score of a document, and the
   elements are sorted in descending order of their scores.  Even if no document corresponds
   to the specified word, it is not an error but returns a dummy array.
   Because the region of the return value is allocated with the `malloc' call, it should be
   released with the `free' call if it is no longer in use.  Note that each element of the array
   of the return value can be data of a deleted document. */
ODPAIR *odsearch(ODEUM *odeum, const char *word, int max, int *np);


/* Get the number of documents including a word.
   `odeum' specifies a database handle.
   `word' specifies a searching word.
   If successful, the return value is the number of documents including the word, else, it is -1.
   Because this function does not read the entity of the inverted index, it is faster than
   `odsearch'. */
int odsearchdnum(ODEUM *odeum, const char *word);


/* Initialize the iterator of a database handle.
   `odeum' specifies a database handle.
   If successful, the return value is true, else, it is false.
   The iterator is used in order to access every document stored in a database. */
int oditerinit(ODEUM *odeum);


/* Get the next document of the iterator.
   `odeum' specifies a database handle.
   If successful, the return value is the handle of the next document, else, it is `NULL'.
   `NULL' is returned when no document remains to be fetched from the iterator.
   It is possible to access every document by calling this function repeatedly.  However,
   this is not assured if the database is updated during the iteration.  Besides, the
   order of this traversal access method is arbitrary, so it is not assured that the order of
   string matches the one of the traversal access.  Because the handle of the return value is
   opened with the function `oddocopen', it should be closed with the function `oddocclose'. */
ODDOC *oditernext(ODEUM *odeum);


/* Synchronize updating contents with the files and the devices.
   `odeum' specifies a database handle connected as a writer.
   If successful, the return value is true, else, it is false.
   This function is useful when another process uses the connected database directory. */
int odsync(ODEUM *odeum);


/* Optimize a database.
   `odeum' specifies a database handle connected as a writer.
   If successful, the return value is true, else, it is false.
   Elements of the deleted documents in the inverted index are purged. */
int odoptimize(ODEUM *odeum);


/* Get the name of a database.
   `odeum' specifies a database handle.
   If successful, the return value is the pointer to the region of the name of the database,
   else, it is `NULL'.
   Because the region of the return value is allocated with the `malloc' call, it should be
   released with the `free' call if it is no longer in use. */
char *odname(ODEUM *odeum);


/* Get the total size of database files.
   `odeum' specifies a database handle.
   If successful, the return value is the total size of the database files, else, it is -1. */
int odfsiz(ODEUM *odeum);


/* Get the total number of the elements of the bucket arrays used in the inverted index.
   `odeum' specifies a database handle.
   If successful, the return value is the total number of the elements of the bucket arrays,
   else, it is -1. */
int odbnum(ODEUM *odeum);


/* Get the number of the documents stored in a database.
   `odeum' specifies a database handle.
   If successful, the return value is the number of the documents stored in the database, else,
   it is -1. */
int oddnum(ODEUM *odeum);


/* Get the number of the words stored in a database.
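
The comment on `odopen` describes which open-mode flags may be combined. The sketch below restates those rules as a caller-side check. It is only an illustration: `omode_is_documented` is a hypothetical helper, not a QDBM function, the enum is a local copy of the one in this header so that the sketch compiles without QDBM, and whether the library itself rejects the other combinations is not stated in the header.

```c
#include <stdio.h>

/* Local copy of the open-mode flags declared in odeum.h, so that this
   sketch compiles without QDBM installed. */
enum {
  OD_OREADER = 1 << 0,                   /* open as a reader */
  OD_OWRITER = 1 << 1,                   /* open as a writer */
  OD_OCREAT = 1 << 2,                    /* a writer creating */
  OD_OTRUNC = 1 << 3,                    /* a writer truncating */
  OD_ONOLCK = 1 << 4                     /* open without locking */
};

/* Hypothetical helper (not part of QDBM): return true if `omode' is one of
   the combinations described in the comment of `odopen', else false. */
int omode_is_documented(int omode){
  int known = OD_OREADER | OD_OWRITER | OD_OCREAT | OD_OTRUNC | OD_ONOLCK;
  int reader = (omode & OD_OREADER) != 0;
  int writer = (omode & OD_OWRITER) != 0;
  /* assumption: unknown bits are treated as invalid */
  if(omode & ~known) return 0;
  /* assumption: the mode is read as exactly one of reader or writer */
  if(reader == writer) return 0;
  /* OD_OCREAT and OD_OTRUNC are described only as additions to OD_OWRITER */
  if(!writer && (omode & (OD_OCREAT | OD_OTRUNC))) return 0;
  /* OD_ONOLCK may be added to either mode */
  return 1;
}

/* Print whether a mode value follows the documented combinations. */
void omode_report(const char *label, int omode){
  printf("%s: %s\n", label, omode_is_documented(omode) ? "documented" : "not documented");
}
```

A caller could run `omode_report("writer|creat", OD_OWRITER | OD_OCREAT)` before passing the same value to `odopen`.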

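`odsearch` returns an array of `ODPAIR` elements for one word, sorted in descending order of score. As an illustration of working with such arrays, the sketch below intersects two result arrays (an AND of two words) and re-sorts the result. `pairs_and` and `cmp_score_desc` are hypothetical helpers, not functions declared in this part of the header. The `ODPAIR` type is a local copy so the sketch compiles without QDBM. Summing the two scores is an assumed ranking rule, and each input array is assumed to hold each ID at most once.

```c
#include <stdlib.h>

/* Local copy of the search-result element declared in odeum.h. */
typedef struct {
  int id;                                /* ID number of the document */
  int score;                             /* score of the document */
} ODPAIR;

/* Comparator for qsort: descending score, ties broken by ascending ID. */
static int cmp_score_desc(const void *a, const void *b){
  const ODPAIR *pa = a;
  const ODPAIR *pb = b;
  if(pa->score != pb->score) return (pa->score < pb->score) - (pa->score > pb->score);
  return (pa->id > pb->id) - (pa->id < pb->id);
}

/* Hypothetical helper: keep the documents present in both arrays.
   The score of a kept document is the sum of its two scores (an assumption).
   The result is allocated with `malloc' and should be released with `free'.
   The number of elements is assigned to the variable pointed to by `np'.
   Returns `NULL' if the allocation fails. */
ODPAIR *pairs_and(const ODPAIR *a, int anum, const ODPAIR *b, int bnum, int *np){
  ODPAIR *out;
  int i, j, n = 0;
  int cap = anum < bnum ? anum : bnum;
  /* allocate at least one element so that an empty result is still a valid pointer */
  out = malloc(sizeof(*out) * (cap > 0 ? cap : 1));
  if(!out) return NULL;
  /* quadratic scan: simple, and adequate for short result lists */
  for(i = 0; i < anum; i++){
    for(j = 0; j < bnum; j++){
      if(a[i].id == b[j].id){
        out[n].id = a[i].id;
        out[n].score = a[i].score + b[j].score;
        n++;
        break;
      }
    }
  }
  qsort(out, n, sizeof(*out), cmp_score_desc);
  *np = n;
  return out;
}
```

Because the header notes that a search result can still contain data of a deleted document, a caller would typically also confirm each ID with `odgetbyid` before presenting it.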