📄 btreeint.h
字号:
/*** 2004 April 6**** The author disclaims copyright to this source code. In place of** a legal notice, here is a blessing:**** May you do good and not evil.** May you find forgiveness for yourself and forgive others.** May you share freely, never taking more than you give.***************************************************************************** $Id: btreeInt.h,v 1.5 2007/06/15 12:06:59 drh Exp $**** This file implements a external (disk-based) database using BTrees.** For a detailed discussion of BTrees, refer to**** Donald E. Knuth, THE ART OF COMPUTER PROGRAMMING, Volume 3:** "Sorting And Searching", pages 473-480. Addison-Wesley** Publishing Company, Reading, Massachusetts.**** The basic idea is that each page of the file contains N database** entries and N+1 pointers to subpages.**** ----------------------------------------------------------------** | Ptr(0) | Key(0) | Ptr(1) | Key(1) | ... | Key(N-1) | Ptr(N) |** ----------------------------------------------------------------**** All of the keys on the page that Ptr(0) points to have values less** than Key(0). All of the keys on page Ptr(1) and its subpages have** values greater than Key(0) and less than Key(1). All of the keys** on Ptr(N) and its subpages have values greater than Key(N-1). And** so forth.**** Finding a particular key requires reading O(log(M)) pages from the ** disk where M is the number of entries in the tree.**** In this implementation, a single file can hold one or more separate ** BTrees. Each BTree is identified by the index of its root page. The** key and data for any entry are combined to form the "payload". A** fixed amount of payload can be carried directly on the database** page. If the payload is larger than the preset amount then surplus** bytes are stored on overflow pages. The payload for an entry** and the preceding pointer are combined to form a "Cell". Each ** page has a small header which contains the Ptr(N) pointer and other** information such as the size of key and data.**** FORMAT DETAILS**** The file is divided into pages. The first page is called page 1,** the second is page 2, and so forth. A page number of zero indicates** "no such page". The page size can be anything between 512 and 65536.** Each page can be either a btree page, a freelist page or an overflow** page.**** The first page is always a btree page. The first 100 bytes of the first** page contain a special header (the "file header") that describes the file.** The format of the file header is as follows:**** OFFSET SIZE DESCRIPTION** 0 16 Header string: "SQLite format 3\000"** 16 2 Page size in bytes. ** 18 1 File format write version** 19 1 File format read version** 20 1 Bytes of unused space at the end of each page** 21 1 Max embedded payload fraction** 22 1 Min embedded payload fraction** 23 1 Min leaf payload fraction** 24 4 File change counter** 28 4 Reserved for future use** 32 4 First freelist page** 36 4 Number of freelist pages in the file** 40 60 15 4-byte meta values passed to higher layers**** All of the integer values are big-endian (most significant byte first).**** The file change counter is incremented when the database is changed** This counter allows other processes to know when the file has changed** and thus when they need to flush their cache.**** The max embedded payload fraction is the amount of the total usable** space in a page that can be consumed by a single cell for standard** B-tree (non-LEAFDATA) tables. A value of 255 means 100%. The default** is to limit the maximum cell size so that at least 4 cells will fit** on one page. Thus the default max embedded payload fraction is 64.**** If the payload for a cell is larger than the max payload, then extra** payload is spilled to overflow pages. Once an overflow page is allocated,** as many bytes as possible are moved into the overflow pages without letting** the cell size drop below the min embedded payload fraction.**** The min leaf payload fraction is like the min embedded payload fraction** except that it applies to leaf nodes in a LEAFDATA tree. The maximum** payload fraction for a LEAFDATA tree is always 100% (or 255) and it** not specified in the header.**** Each btree pages is divided into three sections: The header, the** cell pointer array, and the cell content area. Page 1 also has a 100-byte** file header that occurs before the page header.**** |----------------|** | file header | 100 bytes. Page 1 only.** |----------------|** | page header | 8 bytes for leaves. 12 bytes for interior nodes** |----------------|** | cell pointer | | 2 bytes per cell. Sorted order.** | array | | Grows downward** | | v** |----------------|** | unallocated |** | space |** |----------------| ^ Grows upwards** | cell content | | Arbitrary order interspersed with freeblocks.** | area | | and free space fragments.** |----------------|**** The page headers looks like this:**** OFFSET SIZE DESCRIPTION** 0 1 Flags. 1: intkey, 2: zerodata, 4: leafdata, 8: leaf** 1 2 byte offset to the first freeblock** 3 2 number of cells on this page** 5 2 first byte of the cell content area** 7 1 number of fragmented free bytes** 8 4 Right child (the Ptr(N) value). Omitted on leaves.**** The flags define the format of this btree page. The leaf flag means that** this page has no children. The zerodata flag means that this page carries** only keys and no data. The intkey flag means that the key is a integer** which is stored in the key size entry of the cell header rather than in** the payload area.**** The cell pointer array begins on the first byte after the page header.** The cell pointer array contains zero or more 2-byte numbers which are** offsets from the beginning of the page to the cell content in the cell** content area. The cell pointers occur in sorted order. The system strives** to keep free space after the last cell pointer so that new cells can** be easily added without having to defragment the page.**** Cell content is stored at the very end of the page and grows toward the** beginning of the page.**** Unused space within the cell content area is collected into a linked list of** freeblocks. Each freeblock is at least 4 bytes in size. The byte offset** to the first freeblock is given in the header. Freeblocks occur in** increasing order. Because a freeblock must be at least 4 bytes in size,** any group of 3 or fewer unused bytes in the cell content area cannot** exist on the freeblock chain. A group of 3 or fewer free bytes is called** a fragment. The total number of bytes in all fragments is recorded.** in the page header at offset 7.**** SIZE DESCRIPTION** 2 Byte offset of the next freeblock** 2 Bytes in this freeblock**** Cells are of variable length. Cells are stored in the cell content area at** the end of the page. Pointers to the cells are in the cell pointer array** that immediately follows the page header. Cells is not necessarily** contiguous or in order, but cell pointers are contiguous and in order.**** Cell content makes use of variable length integers. A variable** length integer is 1 to 9 bytes where the lower 7 bits of each ** byte are used. The integer consists of all bytes that have bit 8 set and** the first byte with bit 8 clear. The most significant byte of the integer** appears first. A variable-length integer may not be more than 9 bytes long.** As a special case, all 8 bytes of the 9th byte are used as data. This** allows a 64-bit integer to be encoded in 9 bytes.**** 0x00 becomes 0x00000000** 0x7f becomes 0x0000007f** 0x81 0x00 becomes 0x00000080** 0x82 0x00 becomes 0x00000100** 0x80 0x7f becomes 0x0000007f** 0x8a 0x91 0xd1 0xac 0x78 becomes 0x12345678** 0x81 0x81 0x81 0x81 0x01 becomes 0x10204081**** Variable length integers are used for rowids and to hold the number of** bytes of key and data in a btree cell.**** The content of a cell looks like this:**** SIZE DESCRIPTION** 4 Page number of the left child. Omitted if leaf flag is set.** var Number of bytes of data. Omitted if the zerodata flag is set.** var Number of bytes of key. Or the key itself if intkey flag is set.** * Payload** 4 First page of the overflow chain. Omitted if no overflow**** Overflow pages form a linked list. Each page except the last is completely** filled with data (pagesize - 4 bytes). The last page can have as little** as 1 byte of data.**** SIZE DESCRIPTION** 4 Page number of next overflow page** * Data**** Freelist pages come in two subtypes: trunk pages and leaf pages. The** file header points to the first in a linked list of trunk page. Each trunk** page points to multiple leaf pages. The content of a leaf page is** unspecified. A trunk page looks like this:**** SIZE DESCRIPTION** 4 Page number of next trunk page** 4 Number of leaf pointers on this page** * zero or more pages numbers of leaves*/#include "sqliteInt.h"#include "pager.h"#include "btree.h"#include "os.h"#include <assert.h>/* Round up a number to the next larger multiple of 8. This is used** to force 8-byte alignment on 64-bit architectures.*/#define ROUND8(x) ((x+7)&~7)/* The following value is the maximum cell size assuming a maximum page** size give above.*/#define MX_CELL_SIZE(pBt) (pBt->pageSize-8)/* The maximum number of cells on a single page of the database. This** assumes a minimum cell size of 3 bytes. Such small cells will be** exceedingly rare, but they are possible.*/#define MX_CELL(pBt) ((pBt->pageSize-8)/3)/* Forward declarations */typedef struct MemPage MemPage;typedef struct BtLock BtLock;/*** This is a magic string that appears at the beginning of every** SQLite database in order to identify the file as a real database.**** You can change this value at compile-time by specifying a** -DSQLITE_FILE_HEADER="..." on the compiler command-line. The** header must be exactly 16 bytes including the zero-terminator so** the string itself should be 15 characters long. If you change** the header, then your custom library will not be able to read ** databases generated by the standard tools and the standard tools** will not be able to read databases created by your custom library.*/#ifndef SQLITE_FILE_HEADER /* 123456789 123456 */# define SQLITE_FILE_HEADER "SQLite format 3"#endif/*** Page type flags. An ORed combination of these flags appear as the** first byte of every BTree page.*/#define PTF_INTKEY 0x01#define PTF_ZERODATA 0x02#define PTF_LEAFDATA 0x04#define PTF_LEAF 0x08/*** As each page of the file is loaded into memory, an instance of the following** structure is appended and initialized to zero. This structure stores** information about the page that is decoded from the raw file page.**** The pParent field points back to the parent page. This allows us to** walk up the BTree from any leaf to the root. Care must be taken to** unref() the parent page pointer when this page is no longer referenced.** The pageDestructor() routine handles that chore.*/struct MemPage { u8 isInit; /* True if previously initialized. MUST BE FIRST! */ u8 idxShift; /* True if Cell indices have changed */ u8 nOverflow; /* Number of overflow cell bodies in aCell[] */ u8 intKey; /* True if intkey flag is set */ u8 leaf; /* True if leaf flag is set */ u8 zeroData; /* True if table stores keys only */ u8 leafData; /* True if tables stores data on leaves only */ u8 hasData; /* True if this page stores data */ u8 hdrOffset; /* 100 for page 1. 0 otherwise */ u8 childPtrSize; /* 0 if leaf==1. 4 if leaf==0 */ u16 maxLocal; /* Copy of Btree.maxLocal or Btree.maxLeaf */ u16 minLocal; /* Copy of Btree.minLocal or Btree.minLeaf */ u16 cellOffset; /* Index in aData of first cell pointer */ u16 idxParent; /* Index in parent of this node */ u16 nFree; /* Number of free bytes on the page */ u16 nCell; /* Number of cells on this page, local and ovfl */ struct _OvflCell { /* Cells that will not fit on aData[] */ u8 *pCell; /* Pointers to the body of the overflow cell */ u16 idx; /* Insert this cell before idx-th non-overflow cell */ } aOvfl[5]; BtShared *pBt; /* Pointer back to BTree structure */ u8 *aData; /* Pointer back to the start of the page */ DbPage *pDbPage; /* Pager page handle */ Pgno pgno; /* Page number for this page */ MemPage *pParent; /* The parent of this page. NULL for root */};
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -