⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 fileformat.tcl

📁 sqlite-3.4.1,嵌入式数据库.是一个功能强大的开源数据库,给学习和研发以及小型公司的发展带来了全所未有的好处.
💻 TCL
📖 第 1 页 / 共 2 页
字号:
## Run this script to generated a fileformat.html output file#set rcsid {$Id: fileformat.tcl,v 1.13 2004/10/10 17:24:55 drh Exp $}source common.tclheader {SQLite Database File Format (Version 2)}puts {<h2>SQLite 2.X Database File Format</h2><p>This document describes the disk file format for SQLite versions 2.1through 2.8.  SQLite version 3.0 and following uses a very differentformat which is described separately.</p><h3>1.0 &nbsp; Layers</h3><p>SQLite is implemented in layers.(See the <a href="arch.html">architecture description</a>.)The format of database files is determined by three differentlayers in the architecture.</p><ul><li>The <b>schema</b> layer implemented by the VDBE.</li><li>The <b>b-tree</b> layer implemented by btree.c</li><li>The <b>pager</b> layer implemented by pager.c</li></ul><p>We will describe each layer beginning with the bottom (pager)layer and working upwards.</p><h3>2.0 &nbsp; The Pager Layer</h3><p>An SQLite database consists of"pages" of data.  Each page is 1024 bytes in size.Pages are numbered beginning with 1.A page number of 0 is used to indicate "no such page" in theB-Tree and Schema layers.</p><p>The pager layer is responsible for implementing transactionswith atomic commit and rollback.  It does this using a separatejournal file.  Whenever a new transaction is started, a journalfile is created that records the original state of the database.If the program terminates before completing the transaction, the nextprocess to open the database can use the journal file to restorethe database to its original state.</p><p>The journal file is located in the same directory as the databasefile and has the same name as the database file but with thecharacters "<tt>-journal</tt>" appended.</p><p>The pager layer does not impose any content restrictions on themain database file.  As far as the pager is concerned, each pagecontains 1024 bytes of arbitrary data.  But there is structure tothe journal file.</p><p>A journal file begins with 8 bytes as follows:0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, and 0xd6.Processes that are attempting to rollback a journal use these 8 bytesas a sanity check to make sure the file they think is a journal reallyis a valid journal.  Prior version of SQLite used different journalfile formats.  The magic numbers for these prior formats are differentso that if a new version of the library attempts to rollback a journalcreated by an earlier version, it can detect that the journal usesan obsolete format and make the necessary adjustments.  This articledescribes only the newest journal format - supported as of version2.8.0.</p><p>Following the 8 byte prefix is a three 4-byte integers that tell usthe number of pages that have been committed to the journal,a magic number used forsanity checking each page, and theoriginal size of the main database file before the transaction wasstarted.  The number of committed pages is used to limit how farinto the journal to read.  The use of the checksum magic number isdescribed below.The original size of the database is used to restore the databasefile back to its original size.The size is expressed in pages (1024 bytes per page).</p><p>All three integers in the journal header and all other multi-bytenumbers used in the journal file are big-endian.That means that the most significant byteoccurs first.  That way, a journal file that isoriginally created on one machine can be rolled back by anothermachine that uses a different byte order.  So, for example, atransaction that failed to complete on your big-endian SparcStationcan still be rolled back on your little-endian Linux box.</p><p>After the 8-byte prefix and the three 4-byte integers, thejournal file consists of zero or more page records.  Each pagerecord is a 4-byte (big-endian) page number followed by 1024 bytesof data and a 4-byte checksum.  The data is the original content of the database pagebefore the transaction was started.  So to roll back the transaction,the data is simply written into the corresponding page of themain database file.  Pages can appear in the journal in any order,but they are guaranteed to appear only once. All page numbers will bebetween 1 and the maximum specified by the page size integer thatappeared at the beginning of the journal.</p><p>The so-called checksum at the end of each record is not really achecksum - it is the sum of the page number and the magic number whichwas the second integer in the journal header.  The purpose of thisvalue is to try to detect journal corruption that might have occurredbecause of a power loss or OS crash that occurred which the journalfile was being written to disk.  It could have been the case that themeta-data for the journal file, specifically the size of the file, hadbeen written to the disk so that when the machine reboots it appears thatfile is large enough to hold the current record.  But even though thefile size has changed, the data for the file might not have made it tothe disk surface at the time of the OS crash or power loss.  This meansthat after reboot, the end of the journal file will contain quasi-randomgarbage data.  The checksum is an attempt to detect such corruption.  Ifthe checksum does not match, that page of the journal is not rolled back.</p><p>Here is a summary of the journal file format:</p><ul><li>8 byte prefix: 0xd9, 0xd5, 0x05, 0xf9, 0x20, 0xa1, 0x63, 0xd6</li><li>4 byte number of records in journal</li><li>4 byte magic number used for page checksums</li><li>4 byte initial database page count</li><li>Zero or more instances of the following:   <ul>   <li>4 byte page number</li>   <li>1024 bytes of original data for the page</li>   <li>4 byte checksum</li>   </ul></li></ul><h3>3.0 &nbsp; The B-Tree Layer</h3><p>The B-Tree layer builds on top of the pager layer to implementone or more separate b-trees all in the same disk file.  Thealgorithms used are taken from Knuth's <i>The Art Of ComputerProgramming.</i></p><p>Page 1 of a database contains a header string used for sanitychecking, a few 32-bit words of configuration data, and a pointerto the beginning of a list of unused pages in the database.All other pages in thedatabase are either pages of a b-tree, overflow pages, or unusedpages on the freelist.</p><p>Each b-tree page contains zero or more database entries.Each entry has an unique key of one or more bytes and data ofzero or more bytes.Both the key and data are arbitrary byte sequences.  The combinationof key and data are collectively known as "payload".  The currentimplementation limits the amount of payload in a single entry to1048576 bytes.  This limit can be raised to 16777216 by adjustinga single #define in the source code and recompiling.  But most entriescontain less than a hundred bytes of payload so a megabyte limit seemsmore than enough.</p><p>Up to 238 bytes of payload for an entry can be held directly ona b-tree page.  Any additional payload is contained on a linked listof overflow pages.  This limit on the amount of payload held directlyon b-tree pages guarantees that each b-tree page can hold at least4 entries.  In practice, most entries are smaller than 238 bytes andthus most pages can hold more than 4 entries.</p><p>A single database file can hold any number of separate, independent b-trees.Each b-tree is identified by its root page, which never changes.Child pages of the b-tree may change as entries are added and removedand pages split and combine.  But the root page always stays the same.The b-tree itself does not record which pages are root pages and whichare not.  That information is handled entirely at the schema layer.</p><h4>3.1 &nbsp; B-Tree Page 1 Details</h4><p>Page 1 begins with the following 48-byte string:</p><blockquote><pre>** This file contains an SQLite 2.1 database **</pre></blockquote><p>If you count the number of characters in the string above, you willsee that there are only 47.  A '\000' terminator byte is added tobring the total to 48.</p><p>A frequent question is why the string says version 2.1 when (asof this writing) we are up to version 2.7.0 of SQLite and anychange to the second digit of the version is suppose to representa database format change.  The answer to this is that the B-treelayer has not changed any since version 2.1.  There have beendatabase format changes since version 2.1 but those changes haveall been in the schema layer.  Because the format of the b-treelayer is unchanged since version 2.1.0, the header string stillsays version 2.1.</p><p>After the format string is a 4-byte integer used to determine thebyte-order of the database.  The integer has a value of0xdae37528.  If this number is expressed as 0xda, 0xe3, 0x75, 0x28, thenthe database is in a big-endian format and all 16 and 32-bit integerselsewhere in the b-tree layer are also big-endian.  If the number isexpressed as 0x28, 0x75, 0xe3, and 0xda, then the database is in alittle-endian format and all other multi-byte numbers in the b-tree layer are also little-endian.  Prior to version 2.6.3, the SQLite engine was only able to read databasesthat used the same byte order as the processor they were running on.But beginning with 2.6.3, SQLite can read or write databases in anybyte order.</p><p>After the byte-order code are six 4-byte integers.  Each integer is in thebyte order determined by the byte-order code.  The first integer is thepage number for the first page of the freelist.  If there are no unusedpages in the database, then this integer is 0.  The second integer isthe number of unused pages in the database.  The last 4 integers arenot used by the b-tree layer.  These are the so-called "meta" values thatare passed up to the schema layerand used there for configuration and format version information.All bytes of page 1 past beyond the meta-value integers are unused and are initialized to zero.</p><p>Here is a summary of the information contained on page 1 in the b-tree layer:</p><ul><li>48 byte header string</li><li>4 byte integer used to determine the byte-order</li><li>4 byte integer which is the first page of the freelist</li><li>4 byte integer which is the number of pages on the freelist</li><li>36 bytes of meta-data arranged as nine 4-byte integers</li><li>928 bytes of unused space</li></ul><h4>3.2 &nbsp; Structure Of A Single B-Tree Page</h4><p>Conceptually, a b-tree page contains N database entries and N+1 pointersto other b-tree pages.</p><blockquote><table border=1 cellspacing=0 cellpadding=5><tr><td align="center">Ptr<br>0</td><td align="center">Entry<br>0</td><td align="center">Ptr<br>1</td><td align="center">Entry<br>1</td><td align="center"><b>...</b></td><td align="center">Ptr<br>N-1</td><td align="center">Entry<br>N-1</td><td align="center">Ptr<br>N</td></tr></table></blockquote><p>The entries are arranged in increasing order.  That is, the key toEntry 0 is less than the key to Entry 1, and the key to Entry 1 isless than the key of Entry 2, and so forth.  The pointers point topages containing additional entries that have keys in between theentries on either side.  So Ptr 0 points to another b-tree page thatcontains entries that all have keys less than Key 0, and Ptr 1points to a b-tree pages where all entries have keys greater than Key 0but less than Key 1, and so forth.</p><p>Each b-tree page in SQLite consists of a header, zero or more "cells"each holding a single entry and pointer, and zero or more "free blocks"that represent unused space on the page.</p><p>The header on a b-tree page is the first 8 bytes of the page.The header contains the valueof the right-most pointer (Ptr N) and the byte offset into the pageof the first cell and the first free block.  The pointer is a 32-bitvalue and the offsets are each 16-bit values.  We have:</p><blockquote><table border=1 cellspacing=0 cellpadding=5><tr><td align="center" width=30>0</td><td align="center" width=30>1</td><td align="center" width=30>2</td><td align="center" width=30>3</td><td align="center" width=30>4</td><td align="center" width=30>5</td><td align="center" width=30>6</td><td align="center" width=30>7</td></tr><tr><td align="center" colspan=4>Ptr N</td><td align="center" colspan=2>Cell 0</td><td align="center" colspan=2>Freeblock 0</td></tr></table></blockquote><p>The 1016 bytes of a b-tree page that come after the header containcells and freeblocks.  All 1016 bytes are covered by either a cellor a freeblock.</p><p>The cells are connected in a linked list.  Cell 0 contains Ptr 0 andEntry 0.  Bytes 4 and 5 of the header point to Cell 0.  Cell 0 thenpoints to Cell 1 which contains Ptr 1 and Entry 1.  And so forth.Cells vary in size.  Every cell has a 12-byte header and at least 4bytes of payload space.  Space is allocated to payload in incrementsof 4 bytes.  Thus the minimum size of a cell is 16 bytes and up to63 cells can fit on a single page.  The size of a cell is always a multipleof 4 bytes.A cell can have up to 238 bytes of payload space.  Ifthe payload is more than 238 bytes, then an additional 4 byte pagenumber is appended to the cell which is the page number of the firstoverflow page containing the additional payload.  The maximum sizeof a cell is thus 254 bytes, meaning that a least 4 cells can fit intothe 1016 bytes of space available on a b-tree page.An average cell is usually around 52 to 100 bytes in size with about10 or 20 cells to a page.</p><p>The data layout of a cell looks like this:</p><blockquote><table border=1 cellspacing=0 cellpadding=5><tr><td align="center" width=20>0</td><td align="center" width=20>1</td><td align="center" width=20>2</td><td align="center" width=20>3</td><td align="center" width=20>4</td><td align="center" width=20>5</td><td align="center" width=20>6</td><td align="center" width=20>7</td><td align="center" width=20>8</td><td align="center" width=20>9</td><td align="center" width=20>10</td><td align="center" width=20>11</td><td align="center" width=100>12 ... 249</td><td align="center" width=20>250</td><td align="center" width=20>251</td><td align="center" width=20>252</td><td align="center" width=20>253</td></tr><tr><td align="center" colspan=4>Ptr</td><td align="center" colspan=2>Keysize<br>(low)</td>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -