.TH "INTRO" 3 "2008-11-08" "Man Page" "Tokyo Cabinet"
.SH FUNDAMENTAL SPECIFICATIONS OF TOKYO CABINET VERSION 1
.SH TABLE OF CONTENTS
Introduction
.br
Features
.br
Installation
.br
The Utility API
.br
The Hash Database API
.br
The B+ Tree Database API
.br
The Fixed\-length Database API
.br
The Abstract Database API
.br
License
.br
.SH INTRODUCTION
.PP
Tokyo Cabinet is a library of routines for managing a database.  The database is a simple data file containing records, each of which is a pair of a key and a value.  Every key and value is a variable\-length sequence of bytes.  Both binary data and character strings can be used as keys and values.  There is no concept of data tables or data types.  Records are organized in a hash table, a B+ tree, or a fixed\-length array.
.PP
In a hash table database, each key must be unique, so it is impossible to store two or more records with the same key.  The following access methods are provided: storing a record with a key and a value, deleting a record by key, and retrieving a record by key.  Moreover, traversal access to every key is provided, although the order is arbitrary.  These access methods are similar to those of the DBM library (and its followers NDBM and GDBM) defined in the UNIX standard.  Tokyo Cabinet is an alternative to DBM because of its higher performance.
.PP
In a B+ tree database, records with duplicate keys can be stored.  Storing, deleting, and retrieving are provided as with the hash table database.  Records are stored in the order given by a comparison function assigned by the user.  Each record can be accessed with a cursor in ascending or descending order.  This mechanism makes forward\-matching search for strings and range search for integers possible.  Moreover, transactions are available in the B+ tree database.
.PP
In a fixed\-length array database, records are stored with unique natural numbers as keys.
It is impossible to store two or more records with the same key.  Moreover, the length of each record is limited to the specified length.  The provided operations are the same as those of the hash database.
.PP
Tokyo Cabinet is written in the C language and provided as APIs for C, Perl, Ruby, Java, and Lua.  Tokyo Cabinet is available on platforms that have APIs conforming to C99 and POSIX.  Tokyo Cabinet is free software licensed under the GNU Lesser General Public License.
.SH FEATURES
.PP
Tokyo Cabinet is the successor of QDBM and improves time and space efficiency.  This section describes the features of Tokyo Cabinet.
.SH THE DINOSAUR WING OF THE DBM FORK
.PP
Tokyo Cabinet was developed as the successor of QDBM with the following goals.  They have been achieved, and Tokyo Cabinet replaces QDBM.
.PP
.RS
improves space efficiency : smaller size of database file.
.br
improves time efficiency : faster processing speed.
.br
improves parallelism : higher performance in multi\-thread environment.
.br
improves usability : simplified API.
.br
improves robustness : database file is not corrupted even under catastrophic situation.
.br
supports 64\-bit architecture : enormous memory space and database file are available.
.br
.RE
.PP
As with QDBM, the following three restrictions of traditional DBM are removed: a process can handle only one database, the size of a key and a value is bounded, and a database file is sparse.  Moreover, the following three restrictions of QDBM are removed: the size of a database file is limited to 2GB, environments with different byte orders cannot share a database file, and only one thread can search a database at a time.
.PP
Tokyo Cabinet runs very fast.  For example, the elapsed time to store 1 million records is 1.5 seconds for the hash database and 2.2 seconds for the B+ tree database.  Moreover, Tokyo Cabinet databases are very small.  For example, the overhead per record is 16 bytes for the hash database and 5 bytes for the B+ tree database.
Furthermore, Tokyo Cabinet scales well.  The database size can be up to 8EB (9.22e18 bytes).
.SH EFFECTIVE IMPLEMENTATION OF HASH DATABASE
.PP
Tokyo Cabinet uses a hash algorithm to retrieve records.  If the bucket array has a sufficient number of elements, the time complexity of retrieval is `O(1)'.  That is, the time required to retrieve a record is constant, regardless of the scale of the database.  The same holds for storing and deleting.  Collisions of hash values are managed by separate chaining.  The data structure of each chain is a binary search tree.  Even if the bucket array has unusually few elements, the time complexity of retrieval is `O(log n)'.
.PP
Tokyo Cabinet improves retrieval performance by holding the whole bucket array in RAM.  If the bucket array is in RAM, the region of a target record can be accessed with about one file operation.  The bucket array saved in a file is not read into RAM with the `read' call but mapped directly to RAM with the `mmap' call.  Therefore, the preparation time when connecting to a database is very short, and two or more processes can share the same memory map.
.PP
If the number of elements of the bucket array is about half the number of records stored in the database, the probability of collision of hash values is about 56.7%, although it depends on the characteristics of the input (36.8% if the same, 21.3% if twice, 11.5% if four times, 6.0% if eight times).  In such a case, a record can be retrieved with two or fewer file operations.  Taking this as a performance index, a database containing one million records needs a bucket array with half a million elements.  The size of each element is 4 bytes.  That is, if 2MB of RAM is available, a database containing one million records can be handled.
.PP
Traditional DBM provides two modes of the storing operation: `insert' and `replace'.
When a key matches an existing record, the insert mode keeps the existing value, while the replace mode replaces it with the specified value.  In addition to these two modes, Tokyo Cabinet provides a `concatenate' mode.  In this mode, the specified value is concatenated at the end of the existing value and stored.  This feature is useful when adding an element to a value treated as an array.
.PP
Generally speaking, as updates accumulate, the available regions become fragmented and the size of the database grows rapidly.  Tokyo Cabinet deals with this problem by coalescing and reusing dispensable regions, and by providing database optimization.  When a record is overwritten with a value larger than the existing one, the region must be moved to another position in the file.  Because the time complexity of this operation depends on the size of the record's region, extending values successively is inefficient.  However, Tokyo Cabinet deals with this problem by alignment.  If the increment fits in the padding, the region need not be moved.
.SH USEFUL IMPLEMENTATION OF B+ TREE DATABASE
.PP
Although the B+ tree database is slower than the hash database, it features ordered access to each record.  The order can be assigned by users.  Records of the B+ tree are sorted and arranged in logical pages.  A sparse index organized as a B tree, that is, a multiway balanced tree, is maintained for each page.  Thus, the time complexity of retrieval and so on is `O(log n)'.  A cursor is provided to access each record in order.  The cursor can jump to a position specified by a key and can step forward or backward from the current position.  Because the pages are arranged as a doubly linked list, the time complexity of stepping the cursor is `O(1)'.
.PP
The B+ tree database is implemented on top of the hash database described above.  Because each page of the B+ tree is stored as a record of the hash database, the B+ tree database inherits the storage management efficiency of the hash database.
Because the header of each record is smaller and the alignment of each page is adjusted according to the page size, in most cases the size of the database file is about half that of the hash database.  Although many pages must be operated on to update the B+ tree, Tokyo Cabinet expedites the process by caching pages and reducing file operations.  In most cases, because the whole sparse index is cached in memory, a record can be retrieved with one or fewer file operations.
.PP
The B+ tree database features a transaction mechanism.  It is possible to commit a series of operations between the beginning and the end of the transaction as a unit, or to abort the transaction and roll back to the state before the transaction.  Two isolation levels are supported: serializable and read uncommitted.
.PP
Each page of the B+ tree can be stored compressed.  Two compression methods are supported: Deflate of ZLIB and Block Sorting of BZIP2.  Because the records in a page have similar patterns, high compression efficiency is expected from the Lempel\-Ziv or BWT algorithms.  When handling text data, the size of a database is reduced to about 25%.  If the database is large and disk I/O is the bottleneck, enabling compression improves the processing speed considerably.
.SH NAIVE IMPLEMENTATION OF FIXED\-LENGTH DATABASE
.PP
The fixed\-length database has the restrictions that each key must be a natural number and that the length of each value is limited.  However, its time and space efficiency are higher than those of the other data structures as long as the use case fits within the restrictions.
.PP
Because the whole region of the database is mapped into memory by the `mmap' call and referenced as a multidimensional array, the overhead of file I/O is minimized.
Due to this simple structure, the fixed\-length database works faster than the hash database, and its concurrency in a multi\-thread environment is outstanding.
.PP
The size of the database is proportional to the range of keys and the limit size of each value.  That is, the smaller the range of keys or the smaller the length of each value, the higher the space efficiency.  For example, if the maximum key is 1000000 and the limit size of the value is 100 bytes, the size of the database will be about 100MB.  Because only the regions around referenced records are loaded into RAM, you can increase the size of the database up to the size of the virtual memory.
.SH SIMPLE BUT VARIOUS INTERFACES
.PP
Tokyo Cabinet provides a simple API based on object\-oriented design.  Every database operation is encapsulated and published as a clearly named method such as `open' (connect), `close' (disconnect), `put' (insert), `out' (remove), `get' (retrieve), and so on.  Because the hash, B+ tree, and fixed\-length array database APIs are very similar to each other, porting an application from one to another is easy.
.PP
Tokyo Cabinet provides two modes for connecting to a database: `reader' and `writer'.  A reader can retrieve but can neither store nor delete.  A writer can perform all access methods.  Exclusion control between processes is performed by file locking when connecting to a database.  While a writer is connected to a database, neither readers nor writers can connect.  While a reader is connected to a database, other readers can connect, but writers cannot.  This mechanism guarantees data consistency with simultaneous connections in a multitasking environment.
.PP
The API functions of Tokyo Cabinet are reentrant and available in a multi\-thread environment.  Distinct database objects can be operated on entirely in parallel.  For simultaneous operations on the same database object, a read\-write lock is used for exclusion control.
That is, while a writing thread is operating on the database, other reading and writing threads are blocked.  However, while a reading thread is operating on the database, other reading threads are not blocked.
.PP
The utility API is also provided.  It includes fundamental data structures such as list and map, as well as useful features such as a memory pool, string processing, and encoding.
.PP
Five kinds of API are provided for the C language: the utility API, the hash database API, the B+ tree database API, the fixed\-length database API, and the abstract database API.  Command line interfaces corresponding to each API are also provided.  They are useful for prototyping, testing, and debugging.  Besides C, Tokyo Cabinet provides APIs for Perl, Ruby, Java, and Lua.  The Perl API has methods calling the hash database API, the B+ tree database API, and the fixed\-length database API through the XS language.  The Ruby API has methods calling the hash database API, the B+ tree database API, and the fixed\-length database API as Ruby modules.  The Java API has native methods calling the hash database API, the B+ tree database API, and the fixed\-length database API through the Java Native Interface.  The Lua API has methods calling the hash database API, the B+ tree database API, and the fixed\-length database API as Lua modules.  APIs for other languages will hopefully be provided by third parties.
.PP
When multiple processes access a database at the same time, or when processes access a database on a remote host, the remote service is useful.  The remote service is composed of a database server and its access library.  Applications can access the database server by using the remote database API.  The server partly implements HTTP and the memcached protocol so that client programs on almost all platforms can access the server easily.
.SH INSTALLATION
.PP
This section describes how to install Tokyo Cabinet from the source package.
For a binary package, see its installation manual.
.SH PREPARATION
.PP
Tokyo Cabinet is available on UNIX\-like systems.  At least the following environments are supported.
.PP
.RS
Linux 2.4 and later (x86\-32/x86\-64/PowerPC/Alpha/SPARC)
.br
Mac OS X 10.3 and later (x86\-32/x86\-64/PowerPC)
.br
.RE
.PP
\fBgcc\fR 3.1 or later and \fBmake\fR are required to install Tokyo Cabinet from the source package.  They are installed by default on Linux, FreeBSD and so on.
.PP
As Tokyo Cabinet depends on the following libraries, install them beforehand.
.PP
.RS
zlib : for loss\-less data compression.  1.2.3 or later is suggested.
.br
bzip2 : for loss\-less data compression.  1.0.5 or later is suggested.
.br
.RE
.SH INSTALLATION
.PP
After extracting the archive file of Tokyo Cabinet, change the current working directory to the generated directory and perform installation.
.PP
Run the configuration script.
.PP
Build programs.
.PP
Perform self\-diagnostic test.
.PP
Install programs.  This operation must be carried out by the \fBroot\fR user.
.SH RESULT
.PP
When the series of steps finishes, the following files will be installed.
.SH OPTIONS OF CONFIGURE
.PP
The following options can be specified with `\fB./configure\fR'.
.PP
.RS
\fB\-\-enable\-debug\fR : build for debugging.  Enable debugging symbols, do not perform optimization, and perform static linking.
.br
\fB\-\-enable\-devel\fR : build for development.  Enable debugging symbols, perform optimization, and perform dynamic linking.
.br
\fB\-\-enable\-profile\fR : build for profiling.
Enable profiling symbols, perform optimization, and perform dynamic linking.
.br
\fB\-\-enable\-static\fR : build by static linking.
.br
\fB\-\-enable\-fastest\fR : build for fastest run.
.br
\fB\-\-enable\-off64\fR : build with 64\-bit file offset on 32\-bit system.
.br
\fB\-\-enable\-swab\fR : build for swapping byte\-orders.
.br
\fB\-\-enable\-uyield\fR : build for detecting race conditions.
.br
\fB\-\-disable\-zlib\fR : build without ZLIB compression.
.br
\fB\-\-disable\-bzip\fR : build without BZIP2 compression.
.br
\fB\-\-disable\-pthread\fR : build without POSIX thread support.
.br
\fB\-\-disable\-shared\fR : avoid building shared libraries.
.br
.RE
.PP
`\fB\-\-prefix\fR' and other options are also available as with usual UNIX software packages.  If you want to install Tokyo Cabinet under `\fB/usr\fR' instead of `\fB/usr/local\fR', specify `\fB\-\-prefix=/usr\fR'.  Also, if the library search path does not include `\fB/usr/local/lib\fR', it is necessary to set the environment variable `\fBLD_LIBRARY_PATH\fR' to include `\fB/usr/local/lib\fR' before running applications of Tokyo Cabinet.
.SH HOW TO USE THE LIBRARY
.PP
Tokyo Cabinet provides a C API that is available to programs conforming to the C89 (ANSI C) standard or the C99 standard.  As the header files of Tokyo Cabinet are provided as `\fBtcutil.h\fR', `\fBtchdb.h\fR', `\fBtcbdb.h\fR', and `\fBtcadb.h\fR', applications should include one or more of them as needed to use the API.  As the library is provided as `\fBlibtokyocabinet.a\fR' and `\fBlibtokyocabinet.so\fR', and they depend on `\fBlibz.so\fR', `\fBlibbz2.so\fR', `\fBlibpthread.so\fR', `\fBlibm.so\fR', and `\fBlibc.so\fR', the linker options `\fB\-ltokyocabinet\fR', `\fB\-lz\fR', `\fB\-lbz2\fR', `\fB\-lpthread\fR', `\fB\-lm\fR', and `\fB\-lc\fR' are required in the build command.  A typical build command is the following.
.PP
You can also use Tokyo Cabinet in programs written in C++.
Because each header is wrapped in C linkage (`\fBextern "C"\fR' block), you can simply include them in your C++ programs.
.SH THE UTILITY API
.PP
The utility API is a set of routines for handling records in memory easily.  In particular, the extensible string, array list, hash map, and ordered tree are useful.  See `\fBtcutil.h\fR' for the entire specification.
.SH DESCRIPTION
.PP
To use the utility API, include `\fBtcutil.h\fR' and related standard header files.  Usually, write the following near the top of a source file.
.PP
.RS
.br
\fB#include <tcutil.h>\fR
.br
\fB#include <stdlib.h>\fR
.br
\fB#include <stdbool.h>\fR
.br
\fB#include <stdint.h>\fR
.RE
.PP
Objects whose type is pointer to `\fBTCXSTR\fR' are used for extensible strings.  An extensible string object is created with the function `\fBtcxstrnew\fR' and is deleted with the function `\fBtcxstrdel\fR'.  Objects whose type is pointer to `\fBTCLIST\fR' are used for array lists.  A list object is created with the function `\fBtclistnew\fR' and is deleted with the function `\fBtclistdel\fR'.  Objects whose type is pointer to `\fBTCMAP\fR' are used for hash maps.  A map object is created with the function `\fBtcmapnew\fR' and is deleted with the function `\fBtcmapdel\fR'.  Objects whose type is pointer to `\fBTCTREE\fR' are used for ordered trees.  A tree object is created with the function `\fBtctreenew\fR' and is deleted with the function `\fBtctreedel\fR'.  To avoid memory leaks, it is important to delete every object when it is no longer in use.
.SH API OF BASIC UTILITIES
.PP
The constant `tcversion' is the string containing the version information.
.PP
.RS
.br
\fBextern const char *tcversion;\fR
.RE
.PP
The variable `tcfatalfunc' is the pointer to the callback function for handling a fatal error.
.PP
.RS
.br
\fBextern void (*tcfatalfunc)(const char *);\fR
.RS
The argument specifies the error message.
.RE
.RS
The initial value of this variable is `NULL'.  If the value is `NULL', the default function is called when a fatal error occurs.
A fatal error occurs when memory allocation fails.
.RE
.RE
.PP
The function `tcmalloc' is used to allocate a region in memory.
.PP
.RS
.br
\fBvoid *tcmalloc(size_t \fIsize\fB);\fR
.RS
`\fIsize\fR' specifies the size of the region.
.RE
.RS
The return value is the pointer to the allocated region.
.RE
.RS
This function handles failure of memory allocation implicitly.  Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
.RE
.RE
.PP
The function `tccalloc' is used to allocate a zero\-filled region in memory.
.PP
.RS
.br
\fBvoid *tccalloc(size_t \fInmemb\fB, size_t \fIsize\fB);\fR
.RS
`\fInmemb\fR' specifies the number of elements.
.RE
.RS
`\fIsize\fR' specifies the size of each element.
.RE
.RS
The return value is the pointer to the allocated zero\-filled region.
.RE
.RS
This function handles failure of memory allocation implicitly.  Because the region of the return value is allocated with the `calloc' call, it should be released with the `free' call when it is no longer in use.
.RE
.RE
.PP
The function `tcrealloc' is used to re\-allocate a region in memory.
.PP
.RS
.br
\fBvoid *tcrealloc(void *\fIptr\fB, size_t \fIsize\fB);\fR
.RS
`\fIptr\fR' specifies the pointer to the region.
.RE
.RS
`\fIsize\fR' specifies the size of the region.
.RE
.RS
The return value is the pointer to the re\-allocated region.
.RE
.RS
This function handles failure of memory allocation implicitly.
Because the region of the return value is allocated with the `realloc' call, it should be released with the `free' call when it is no longer in use.
.RE
.RE
.PP
The function `tcmemdup' is used to duplicate a region in memory.
.PP
.RS
.br
\fBvoid *tcmemdup(const void *\fIptr\fB, size_t \fIsize\fB);\fR
.RS
`\fIptr\fR' specifies the pointer to the region.
.RE
.RS
`\fIsize\fR' specifies the size of the region.
.RE
.RS
The return value is the pointer to the allocated region of the duplicate.
.RE
.RS
Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string.  Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
.RE
.RE
.PP
The function `tcstrdup' is used to duplicate a string in memory.
.PP
.RS
.br
\fBchar *tcstrdup(const void *\fIstr\fB);\fR
.RS
`\fIstr\fR' specifies the string.
.RE
.RS
The return value is the allocated string equivalent to the specified string.
.RE
.RS
Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.
.RE
.RE
.PP
The function `tcfree' is used to free a region in memory.
.PP
.RS
.br
\fBvoid tcfree(void *\fIptr\fB);\fR
.RS
`\fIptr\fR' specifies the pointer to the region.
If it is `NULL', this function has no effect.
.RE
.RS
Although this function is just a wrapper of the `free' call, it is useful in applications using another package of the `malloc' series.
.RE
.RE
.SH API OF EXTENSIBLE STRING
.PP
The function `tcxstrnew' is used to create an extensible string object.
.PP
.RS
.br
\fBTCXSTR *tcxstrnew(void);\fR
.RS
The return value is the new extensible string object.
.RE
.RE
.PP
The function `tcxstrnew2' is used to create an extensible string object from a character string.
.PP
.RS
.br
\fBTCXSTR *tcxstrnew2(const char *\fIstr\fB);\fR
.RS
`\fIstr\fR' specifies the string of the initial content.
.RE
.RS
The return value is the new extensible string object containing the specified string.
.RE
.RE
.PP
The function `tcxstrnew3' is used to create an extensible string object with the initial allocation size.
.PP
.RS
.br
\fBTCXSTR *tcxstrnew3(int \fIasiz\fB);\fR
.RS
`\fIasiz\fR' specifies the initial allocation size.
.RE
.RS
The return value is the new extensible string object.
.RE
.RE
.PP
The function `tcxstrdup' is used to copy an extensible string object.
.PP
.RS
.br
\fBTCXSTR *tcxstrdup(const TCXSTR *\fIxstr\fB);\fR
.RS
`\fIxstr\fR' specifies the extensible string object.
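.PP
The zero\-terminated duplication described for `tcmemdup' among the basic utilities can be illustrated with a minimal sketch in plain C.  The helper name `dup_region' is hypothetical and is not part of Tokyo Cabinet.  Aborting on allocation failure is an assumption that stands in for the library's implicit handling of failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a Tokyo Cabinet function: copy `size' bytes
   from `ptr' and append one zero byte, as described for tcmemdup. */
void *dup_region(const void *ptr, size_t size) {
  assert(ptr != NULL);
  char *copy = malloc(size + 1);  /* one extra byte for the zero code */
  if (copy == NULL) abort();      /* assumption: allocation failure is fatal */
  memcpy(copy, ptr, size);
  copy[size] = '\0';              /* lets callers treat the copy as a C string */
  return copy;
}
```

Because the copy comes from `malloc', the caller releases it with `free'.  Binary data containing embedded zero bytes is copied in full, while string functions applied to the result stop at the first zero byte.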