Subversion on Berkeley DB                                    -*- text -*-

There are many different ways to implement the Subversion filesystem
interface.  You could implement it directly using ordinary POSIX
filesystem operations; you could build it using an SQL server as a
back end; you could build it on RCS; and so on.

This implementation of the Subversion filesystem interface is built on
top of Berkeley DB (http://www.sleepycat.com).  Berkeley DB supports
transactions and recoverability, making it well-suited for Subversion.



Nodes and Node Revisions

In a Subversion filesystem, a `node' corresponds roughly to an
`inode' in a Unix filesystem:

   * A node is either a file or a directory.

   * A node's contents change over time.

   * When you change a node's contents, it's still the same node; it's
     just been changed.  So a node's identity isn't bound to a specific
     set of contents.

   * If you rename a node, it's still the same node, just under a
     different name.  So a node's identity isn't bound to a particular
     filename.

A `node revision' refers to a node's contents at a specific point in
time.  Changing a node's contents always creates a new revision of that
node.  Once created, a node revision's contents never change.

When we create a node, its initial contents are the initial revision of
the node.  As users make changes to the node over time, we create new
revisions of that same node.  When a user commits a change that deletes
a file from the filesystem, we don't delete the node, or any revision
of it --- those stick around to allow us to recreate prior revisions of
the filesystem.  Instead, we just remove the reference to the node
from the directory.



IDs

Within the database, we refer to nodes and node revisions using a
string of three unique identifiers (the "node ID", the "copy ID", and
the "txn ID"), separated by periods.

    node_revision_id ::= node_id '.' copy_id '.' txn_id

The node ID is unique to a particular node in the filesystem across
all of revision history.  That is, two node revisions that share
revision history (for example, because they are different revisions
of the same node, or because one is a copy of the other) have the same
node ID, whereas two node revisions that have no common revision
history will not have the same node ID.

The copy ID is a key into the `copies' table (see `Copies' below), and
identifies that a given node revision, or one of its ancestors,
resulted from a unique filesystem copy operation.

The txn ID is just an identifier that is unique to a single filesystem
commit.  All node revisions created as part of a commit share this txn
ID (which, incidentally, gets its name from the fact that this id is
the same id used as the primary key of Subversion transactions; see
`Transactions' below).
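
As a minimal sketch of the grammar above, the following Python splits
a node revision ID into its three parts and compares node IDs.  The
names NodeRevisionID, parse_node_revision_id and share_history are
hypothetical, chosen for illustration; they are not part of
Subversion.

```python
from typing import NamedTuple


class NodeRevisionID(NamedTuple):
    node_id: str
    copy_id: str
    txn_id: str


def parse_node_revision_id(text: str) -> NodeRevisionID:
    """Split 'node_id.copy_id.txn_id' into its three components."""
    parts = text.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"malformed node revision ID: {text!r}")
    return NodeRevisionID(*parts)


def share_history(a: NodeRevisionID, b: NodeRevisionID) -> bool:
    """Node revisions share revision history exactly when node IDs match."""
    return a.node_id == b.node_id
```

The copy ID and txn ID are deliberately ignored by share_history:
only the node ID says whether two node revisions have common history.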

A directory entry identifies the file or subdirectory it refers to
using a node revision ID --- not a node ID.  This means that a change
to a file far down in a directory hierarchy requires the parent
directory of the changed node to be updated, to hold the new node
revision ID.  Now, since that parent directory has changed, its parent
needs to be updated, and so on to the root.  We call this process
"bubble-up".

If a particular subtree was unaffected by a given commit, the node
revision ID that appears in its parent will be unchanged.  When
doing an update, we can notice this, and ignore that entire
subtree.  This makes it efficient to find localized changes in
large trees.
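
Bubble-up and the subtree shortcut can be sketched in a few lines of
Python.  This is an illustration under stated assumptions, not the
Berkeley DB implementation: the store is a plain dict mapping node
revision IDs to either bytes (a file) or a dict of entry names to
child IDs (a directory), a new revision keeps its node ID and copy ID
and takes the committing txn ID, and the function names are
hypothetical.

```python
def new_revision_id(old_id, txn_id):
    """Same node and copy, new txn: the ID of the successor revision."""
    node_id, copy_id, _ = old_id.split(".")
    return f"{node_id}.{copy_id}.{txn_id}"


def bubble_up(store, root_id, path, new_contents, txn_id):
    """Write new_contents at path; return the new root's node revision ID.

    path is a list of entry names.  Existing entries in store are never
    modified; every directory on the path gets a new revision.
    """
    new_id = new_revision_id(root_id, txn_id)
    if not path:
        store[new_id] = new_contents
        return new_id
    entries = dict(store[root_id])  # copy, so the old revision stays intact
    name, rest = path[0], path[1:]
    entries[name] = bubble_up(store, entries[name], rest, new_contents, txn_id)
    store[new_id] = entries
    return new_id


def changed_paths(store, old_id, new_id, prefix=""):
    """Yield paths that differ, skipping subtrees whose IDs are equal."""
    if old_id == new_id:
        return  # unchanged subtree: ignore it entirely
    old, new = store[old_id], store[new_id]
    if isinstance(old, dict) and isinstance(new, dict):
        for name in sorted(set(old) | set(new)):
            child = f"{prefix}/{name}"
            if name in old and name in new:
                yield from changed_paths(store, old[name], new[name], child)
            else:
                yield child  # entry added or removed
    else:
        yield prefix
```

Changing one file below the root creates one new revision per
directory on the path, and changed_paths never descends into a
sibling whose ID is the same in both trees.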



A Word About Keys

Some of the Subversion database tables use base-36 numbers as their
keys.  Some debate exists about whether the use of base-36 (as opposed
to, say, regular decimal values) is either necessary or good.  It is
outside the scope of this document to make a claim for or against this
usage.  As such, the reader will please note that for the majority of
the document, the use of the term "number" when referring to keys of
database tables should be interpreted to mean "a monotonically
increasing unique key whose order with respect to other keys in the
table is irrelevant".  :-)

To determine the actual type currently in use for the keys of a given
table, you are invited to check out the "Appendix: Filesystem
structure summary" section of this document.
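
For readers who have not met base-36 keys before, here is a small
sketch of how one could generate the next key.  The digit alphabet
(0-9 followed by lowercase a-z) and the function name next_key are
assumptions made for this example; this document does not specify
either.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def next_key(key: str) -> str:
    """Return the base-36 successor of key."""
    value = int(key, 36) + 1
    out = ""
    while value:
        value, remainder = divmod(value, 36)
        out = DIGITS[remainder] + out
    return out
```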



NODE-REVISION and HEADER: how we represent a node revision

We represent a given revision of a file or directory node using a list
skel (see skel.h for an explanation of skels).  A node revision skel
has the form:

    (HEADER PROP-KEY KIND-SPECIFIC ...)

where HEADER is a header skel, whose structure is common to all nodes,
PROP-KEY is the key of the representation that contains this node's
properties list, and the KIND-SPECIFIC elements carry data dependent
on what kind of node this is --- file, directory, etc.

HEADER has the form:

    (KIND CREATED-PATH PRED-ID PRED-COUNT)

where:

   * KIND indicates what sort of node this is.  It must be one of the
     following:
       - "file", indicating that the node is a file (see FILE below).
       - "dir", indicating that the node is a directory (see DIR below).

   * CREATED-PATH is the canonicalized absolute filesystem path at
     which this node was created.

   * PRED-ID, if present, indicates the node revision which is the
     immediate ancestor of this node.

   * PRED-COUNT, if present, indicates the number of predecessors the
     node revision has (recursively).

Note that a node cannot change its kind from one revision to the next.
A directory node is always a directory; a file node is always a file;
etc.  The fact that the node's kind is stored in each node revision,
rather than in some revision-independent place, might suggest that
it's possible for a node to change kinds from revision to revision, but
Subversion does not allow this.

PROP-KEY is a key into the `representations' table (see REPRESENTATIONS 
below), whose value is a representation pointing to a string 
(see `strings' table) that is a PROPLIST skel.

The KIND-SPECIFIC portions are discussed below.
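
The HEADER layout can be illustrated with a short Python sketch.  It
assumes the skel has already been parsed into a list of strings, that
an absent PRED-ID or PRED-COUNT shows up as a missing or empty
element, and that PRED-COUNT is a decimal integer; none of this is
stated by this document, and the names Header, parse_header and
check_same_kind are hypothetical.

```python
from typing import NamedTuple, Optional


class Header(NamedTuple):
    kind: str
    created_path: str
    pred_id: Optional[str]
    pred_count: Optional[int]


def parse_header(skel: list) -> Header:
    """Turn [KIND, CREATED-PATH, PRED-ID, PRED-COUNT] into a Header."""
    if not 2 <= len(skel) <= 4:
        raise ValueError("HEADER must have between 2 and 4 elements")
    kind, created_path = skel[0], skel[1]
    if kind not in ("file", "dir"):
        raise ValueError(f"unknown node kind: {kind!r}")
    pred_id = skel[2] if len(skel) > 2 and skel[2] else None
    pred_count = int(skel[3]) if len(skel) > 3 and skel[3] else None
    return Header(kind, created_path, pred_id, pred_count)


def check_same_kind(pred: Header, succ: Header) -> None:
    """Reject a successor whose kind differs from its predecessor's."""
    if pred.kind != succ.kind:
        raise ValueError("a node cannot change kind between revisions")
```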



PROPLIST: a property list is a list skel of the form:

    (NAME1 VALUE1 NAME2 VALUE2 ...)

where each NAMEi is the name of a property, and VALUEi is the value of
the property named NAMEi.  Every valid property list has an even
number of elements.
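
A brief sketch of reading such a list, assuming the skel has already
been parsed into a flat Python list (the function name
proplist_to_dict and the sample property names in the usage note are
illustrative only):

```python
def proplist_to_dict(skel: list) -> dict:
    """Pair up (NAME1 VALUE1 NAME2 VALUE2 ...) into a name -> value dict."""
    if len(skel) % 2 != 0:
        raise ValueError("a property list must have an even number of elements")
    return dict(zip(skel[0::2], skel[1::2]))
```

For instance, proplist_to_dict(["color", "red", "owner", "jrandom"])
pairs "color" with "red" and "owner" with "jrandom".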



FILE: how files are represented.

If a NODE-REVISION's header's KIND is "file", then the node-revision
skel represents a file, and has the form:

    (HEADER PROP-KEY DATA-KEY [EDIT-DATA-KEY])

where DATA-KEY identifies the representation for the file's current
contents, and EDIT-DATA-KEY, if present, identifies a representation
currently available for receiving new contents for the file.

See discussion of representations later.



DIR: how directories are represented.

If the header's KIND is "dir", then the node-revision skel
represents a directory, and has the form:

    (HEADER PROP-KEY ENTRIES-KEY)

where ENTRIES-KEY identifies the representation for the directory's
entries list (see discussion of representations later).  An entries
list has the form

    (ENTRY ...)

where each entry is

    (NAME ID)

where:

   * NAME is the name of the directory entry, in UTF-8, and

   * ID is the ID of the node revision to which this entry refers.
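
As an illustration, an entries list can be modeled as a Python list
of [NAME, ID] pairs.  The helpers below are hypothetical and assume
the skel has already been parsed; without_entry shows that removing a
file from a directory only drops the reference to it and leaves the
node revision itself alone.

```python
def lookup_entry(entries, name):
    """Return the node revision ID stored under name, or None."""
    for entry_name, node_rev_id in entries:
        if entry_name == name:
            return node_rev_id
    return None


def without_entry(entries, name):
    """Return a new entries list that omits name; the input is untouched."""
    return [entry for entry in entries if entry[0] != name]
```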



REPRESENTATIONS: where and how Subversion stores your data.

Some parts of a node revision are essentially constant-length: for
example, the KIND field and the REV.  Other parts can have
arbitrarily varying length: property lists, file contents, and
directory entry lists.  This variable-length data is often similar
from one revision to the next, so Subversion stores just the deltas
between them, instead of successive fulltexts.

The HEADER portion of a node revision holds the constant-length stuff,
which is never deltified.  The rest of a node revision just points to
data stored outside the node revision proper.  This design makes the
repository code easier to maintain, because deltification and
undeltification are confined to a layer separate from node revisions,
and makes the code more efficient, because Subversion can retrieve
just the parts of a node it needs for a given operation.

Deltifiable data is stored in the `strings' table, as mediated by the
`representations' table.  Here's how it works:

The `strings' table stores only raw bytes.  A given string could be
any one of these:

   - a file's contents
   - a delta that reconstructs file contents, or part of a file's contents
   - a directory entry list skel
   - a delta that reconstructs a dir entry list skel, or part of same
   - a property list skel
   - a delta that reconstructs a property list skel, or part of same

There is no way to tell, from looking at a string, what kind of data
it is.  A directory entry list skel is indistinguishable from file
contents that just happen to look exactly like the unparsed form of a
directory entry list skel.  File contents that just happen to look
like svndiff data are indistinguishable from delta data.

The code is able to interpret a given string because Subversion

   a) knows whether to be looking for a property list or some
      kind-specific data,

   b) knows the `kind' of the node revision in question,

   c) always goes through the `representations' table to discover if
      any undeltification or other transformation is needed.

The `representations' table is an intermediary between node revisions
and strings.  Node revisions never refer directly into the `strings'
table; instead, they always refer into the `representations' table,
which knows whether a given string is a fulltext or a delta, and if it
is a delta, what it is a delta against.  That, combined with the
knowledge in (a) and (b) above, allows Subversion to retrieve the data
and parse it appropriately.  A representation has the form:

   (HEADER KIND-SPECIFIC)

where HEADER is

   (KIND TXN [CHECKSUM])

The KIND is "fulltext" or "delta".  TXN is the txn ID for the txn in
which this representation was created.  CHECKSUM is a checksum of the
representation's contents, that is, what the representation produces,
regardless of whether it is stored deltified or as fulltext.  (For
compatibility with older versions of Subversion, CHECKSUM may be
absent, in which case the filesystem behaves as though the checksum is
there and is correct.)
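
The optional-checksum rule might be sketched as follows.  This
document does not say which checksum algorithm is used or how the
digest is encoded, so the MD5 hex digest here is purely a stand-in,
and verify_contents is a hypothetical name.

```python
import hashlib


def verify_contents(header: list, contents: bytes) -> bool:
    """Check contents against a header [KIND, TXN] or [KIND, TXN, CHECKSUM].

    contents is what the representation produces, whether it is stored
    as fulltext or as a delta.
    """
    if len(header) < 3:
        return True  # absent checksum: behave as though it were correct
    # Assumption: CHECKSUM holds an MD5 hex digest.
    return hashlib.md5(contents).hexdigest() == header[2]
```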

The TXN also serves as a kind of mutability flag: if txn T tries to
change a representation's contents, but the rep's TXN is not T, then
something has gone horribly wrong and T should leave the rep alone
(and probably error).  Of course, "change a representation" here means
changing what the rep's consumer sees.  Switching a representation's
storage strategy, for example from fulltext to deltified, wouldn't
count as a change, since that wouldn't affect what the rep produces.
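
A minimal sketch of that mutability check, assuming a representation
has been parsed into nested Python lists shaped like
[[KIND, TXN, ...], ...].  The exception and function names are
hypothetical.

```python
class RepNotMutableError(Exception):
    """A txn tried to change a representation it did not create."""


def rep_txn(rep: list) -> str:
    """Return the TXN field from a representation's HEADER."""
    return rep[0][1]


def ensure_mutable(rep: list, txn_id: str) -> None:
    """Raise unless txn_id is the txn that created rep."""
    if rep_txn(rep) != txn_id:
        raise RepNotMutableError(
            f"txn {txn_id!r} may not change a rep created in txn {rep_txn(rep)!r}"
        )
```

A caller would run ensure_mutable before altering what the rep
produces; re-storing the same contents in a different form would not
need the check.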

KIND-SPECIFIC varies considerably depending on the kind of
representation.  Here are the two forms currently recognized:

   (("fulltext" TXN [CHECKSUM]) KEY)
       The data is at KEY in the `strings' table.

   (("delta" TXN [CHECKSUM]) (OFFSET WINDOW) ...)
       Each OFFSET indicates the point in the fulltext that this
       element reconstructs, and WINDOW says how to reconstruct it:

       WINDOW ::= (DIFF SIZE REP-KEY [REP-OFFSET]) ;
       DIFF   ::= ("svndiff" VERSION STRING-KEY)

       Notice that a WINDOW holds only metadata.  REP-KEY says what
       the window should be applied against, or none if this is a
       self-compressed delta; SIZE says how much data this window
       reconstructs; VERSION says what version of the svndiff format
       is being used (currently only version 0 is supported); and
       STRING-KEY says which string contains the actual svndiff data
