📄 structure
字号:
because of the node revision's "file" type, to interpret the result asfile contents, not as a directory entry list.(Note that the `strings' table stores multiple DB values per key.That is, although it's accurate to say there is one string per key,the string may be divided into multiple consecutive blocks, allsharing that key. You use a Berkeley DB cursor to find the desiredvalue[s], when retrieving a particular offset+len in a string.)Representations know nothing about ancestry -- the `representations'table never refers to node revision id's, only to strings or to otherrepresentations. In other words, while the `nodes' table allowsrecovery of ancestry information, the `representations' and `strings'tables together handle deltification and undeltification*independently* of ancestry. At present, Subversion generally storesthe youngest strings in "fulltext" form, and older strings as "delta"sagainst them (unless the delta would save no space compared to thefulltext). However, there's nothing magic about that particulararrangement. Other interesting alternatives: * We could store the N most recently accessed strings as fulltexts, letting access patterns determine the most appropriate representation for each revision. * We could occasionally store deltas against the N'th younger revision, storing larger jumps with a frequency inverse to the distance covered, yielding a tree-structured history.Since the filesystem interface doesn't expose these details, we canchange the representation pretty much as we please to optimizewhatever parameter we care about --- storage size, speed, robustness,etc.Representations never share strings. Every string is represented byexactly one representation; every representation represents exactlyone string. This is so we can replace a string with deltified versionof itself, change the representation referring to it, and know thatwe're not messing up any other reps by doing so.Further Notes On Deltifying:----------------------------When a representation is deltified, it is changed in place, along withits underlying string. That is, the node revision referring to thatrepresentation will not be changed; instead, the same rep key will nowbe associated with different value. That way, we get reader lockingfor free: if someone is reading a file while Subversion is deltifyingthat file, one of the two sides will get a DB_DEADLOCK andsvn_fs__retry_txn() will retry.### todo: add a note about cycle-checking here, too.The Berkeley DB "nodes" tableThe database contains a table called "nodes", which is a btree indexedby node revision ID's, mapping them onto REPRESENTATION skels. Node 0is always the root directory, and node revision ID 0.0.0 is always theempty directory. We use the value of the key 'next-key' to indicatethe next unused node ID.Assuming that we store the most recent revision on every branch asfulltext, and all other revisions as deltas, we can retrieve any noderevision by searching for the last revision of the node, and thenwalking backwards to specific revision we desire, applying deltas aswe go.REVISION: filesystem revisions, and the Berkeley DB "revisions" tableWe represent a filesystem revision using a skel of the form: ("revision" TXN)where TXN is the key into the `transactions' table (see 'Transactions' below)whose value is the transaction that was committed to create this revision.The database contains a table called "revisions", which is arecord-number table mapping revision numbers onto REVISION skels.Since Berkeley DB record numbers start with 1, whereas Subversionfilesystem revision numbers start at zero, revision V is stored asrecord number V+1 in the `revisions' table. Filesystem revision zeroalways has node revision 0.0.0 as its root directory; that noderevision is guaranteed to be an empty directory.TransactionsEvery transaction ends when it is either successfully committed, oraborted. We call a transaction which has been either committed oraborted "finished", and one which hasn't "unfinished". Transactions are identified by unique numbers, called transactionID's. Currently, transaction ID's are never reused, though this isnot mandated by the schema. In the database, we always represent atransaction ID in its shortest ASCII form.The Berkeley DB `transactions' table records both unfinished andcommitted transactions. Every key in this table is a transaction ID.Unfinished transactions have values that are skels of one of thefollowing forms: ("transaction" ROOT-ID BASE-ID PROPLIST COPIES) ("dead" ROOT-ID BASE-ID PROPLIST COPIES)where: * ROOT-ID is the node revision ID of the transaction's root directory. * BASE-ID is the node revision ID of the root of the transaction's base revision. * PROPLIST is a skel giving the revision properties for the transaction. * COPIES contains a list of keys into the `copies' table, referencing all the filesystem copies created inside of this transaction. If the transaction is aborted, these copies get removed from the `copies' table. * A "dead" transaction is one that has been requested to be destroyed, and should never, ever, be committed.Committed transaction, however, have values that are skels of the form: ("committed" ROOT-ID REV PROPLIST COPIES)where: * ROOT-ID is the node revision ID of the committed transaction's (or revision's) root node. * REVISION represents the revision that was created when the transaction was committed. * PROPLIST is a skel giving the revision properties for the committed transaction. * COPIES contains a list of keys into the `copies' table, referencing all the filesystem copies created by this committed transaction. Nothing currently uses this information for committed transactions, but it could be useful in the future.As the sole exception to the rule above, the `transactions' tablealways has one entry whose key is `next-key', and whose value is thelowest transaction ID that has never yet been used. We use this entryto allocate ID's for new transactions.The `transactions' table is a btree, with no particular sort order.ChangesAs modifications are made (files and dirs added or removed, text andproperties changed, etc.) on Subversion transaction trees, thefilesystem tracks the basic change made in the Berkeley DB `changes'table. The `changes' table is a btree with Berkeley's "duplicate keys"functionality (and with no particular sort order), and maps theone-to-many relationship of a transaction ID to a "change" item.Change items are skels of the form: ("change" PATH ID CHANGE-KIND TEXT-MOD PROP-MOD)where: * PATH is the path that was operated on to enact this change. * ID is the node revision ID of the node changed (may be a zero atom, but only in the "reset" kind case). * CHANGE-KIND is one of the following: - "add" : PATH/ID was added to the filesystem. - "delete" : PATH/ID was removed from the filesystem. - "replace" : PATH/ID was removed, then re-added to the filesystem. - "modify" : PATH/ID was otherwise modified. - "reset" : Ignore any previous changes for PATH/ID in this txn. * TEXT-MOD is a bit specifying whether or not the contents of this node was modified. * PROP-MOD is a bit specifying whether or not the properties of this node where modified.In order to fully describe the changes made to any given path as partof a single transaction, one must read all the change items associatedwith the transaction's ID, and "collapse" multiple entries that referto that path. CopiesEach time a filesystem copy operation is performed, Subversion recordsmeta-data about that copy. Copies are identified by unique numbers called copy ID's. Currently,copy ID's are never reused, though this is not mandated by the schema.In the database, we always represent a copy ID in its shortest ASCIIform.The Berkeley DB `copies' table records all filesystem copies. Everykey in this table is copy ID, and every value is a skel of one of thefollowing forms: ("copy" SRC-PATH SRC-TXN DST-NODE-ID) ("soft-copy" SRC-PATH SRC-TXN DST-NODE-ID)where: * SRC-PATH and SRC-TXN are the canonicalized absolute path and transaction ID, respectively, of the source of the copy. * DST-NODE-ID represents the new node revision created as a result of the copy.As the sole exception to the rule above, the `copies' table always hasone entry whose key is `next-key', and whose value is the lowest copy IDthat has never yet been used. We use this entry to allocate newcopy ID's.The `copies' table is a btree, with no particular sort order.LocksWhen a caller locks a file -- reserving an exclusive right to modifyor delete it -- an lock object is created in a `locks' table.The `locks' table is a btree whose key is a UUID string (also known asa "lock-token"), and whose value is a skel representing a lock. Thefields in the skel mirror those of an svn_lock__t (see svn_types.h): ("lock" PATH UUID OWNER COMMENT XML-P CREATION-DATE EXPIRATION-DATE)where: * PATH is the absolute filesystem path reserved by the lock. * UUID is the universally unique identifier of the lock, also known as the lock-token. This is the same as the row's key. * OWNER is the authenticated username that "owns" the lock. * COMMENT is a string describing the lock. It may be empty, or it might describe the rationale for locking. * XML-P is a boolean (either 0 or 1) indicating whether the COMMENT field is wrapped in an XML tag. (This is something only used by the DAV layer, for webdav interoperabliity.) * CREATION-DATE is a string representation of the date/time when the lock was created. (see svn_time_to_cstring()) * EXPIRATION-DATE is a string representation of the date/time when the lock will cease to be valid. (see svn_time_to_cstring())In addition to creating a lock in the `locks' table, a new row iscreated in a `lock-tokens' table. The `lock-tokens' table is a btreewhose key is an absolute path in the filesystem. The value of eachkey is a lock-token (which is a key into the `locks' table.)To test if a path is locked, simply check if the path is a key in the`lock-tokens' table. To see if a certain directory has any lockedchildren below, we ask BerkeleyDB to do a "greater or equal match" onthe directory path, and see if any results come back. If they do,then at least one of the directory's children is locked, and thus thedirectory cannot be deleted without further investigation.Locks are ephemeral things, not historied in any way. They arepotentially created and deleted quite often. When a lock isdestroyed, the appropriate row is removed from the `locks' table.Additionally, the locked-path is removed from the `lock-tokens' table.Merge rulesThe Subversion filesystem must provide the following characteristics:- clients can submit arbitrary rearrangements of the tree, to be performed as atomic changes to the filesystem tree- multiple clients can submit non-overlapping changes at the same time, without blocking- readers must never block other readers or writers- writers must never block readers- writers may block writersMerging rules: The general principle: a series of changes can be merged iff the final outcome is independent of the order you apply them in.Merging algorithm: For each entry NAME in the directory ANCESTOR: Let ANCESTOR-ENTRY, SOURCE-ENTRY, and TARGET-ENTRY be the IDs of the name within ANCESTOR, SOURCE, and TARGET respectively. (Possibly null if NAME does not exist in SOURCE or TARGET.) If ANCESTOR-ENTRY == SOURCE-ENTRY, then: No changes were made to this entry while the transaction was in progress, so do nothing to the target. Else if ANCESTOR-ENTRY == TARGET-ENTRY, then: A change was made to this entry while the transaction was in process, but the transaction did not touch this entry. Replace TARGET-ENTRY with SOURCE-ENTRY. Else: Changes were made to this entry both within the transaction and to the repository while the transaction was in progress. They must be merged or declared to be in conflict. If SOURCE-ENTRY and TARGET-ENTRY are both null, that's a double delete; flag a conflict. If any of the three entries is of type file, declare a conflict. If either SOURCE-ENTRY or TARGET-ENTRY is not a direct modification of ANCESTOR-ENTRY (determine by comparing the node-id fields), declare a conflict. A replacement is incompatible with a modification or other replacement--even an identical replacement. Direct modifications were made to the directory ANCESTOR-ENTRY in both SOURCE and TARGET. Recursively merge these modifications.
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -