@' The contents of this file are subject to the MonetDB Public License
@' Version 1.1 (the "License"); you may not use this file except in
@' compliance with the License. You may obtain a copy of the License at
@' http://monetdb.cwi.nl/Legal/MonetDBLicense-1.1.html
@'
@' Software distributed under the License is distributed on an "AS IS"
@' basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the
@' License for the specific language governing rights and limitations
@' under the License.
@'
@' The Original Code is the MonetDB Database System.
@'
@' The Initial Developer of the Original Code is CWI.
@' Portions created by CWI are Copyright (C) 1997-2007 CWI.
@' All Rights Reserved.

@f gdk_tm
@a M. L. Kersten, P. Boncz, N. J. Nes
@* Transaction management
The Transaction Manager maintains the buffer of (permanent) BATs held resident.
Entries from the BAT buffer are always accessed by BAT id.
A BAT becomes permanent by assigning a name with @%BATrename@.
Access to the transaction table is regulated by a semaphore.
@{
@h
#ifndef _GDK_TM_H_
#define _GDK_TM_H_

#include "gdk.h"

#define SYSTRANSACTION "tmp"
#define MAXTM 10

#endif /* _GDK_TM_H_ */
@c
#include "monetdb_config.h"
#include "gdk.h"
#include "gdk_tm.h"

@-
The physical (disk) commit protocol is handled mostly by BBPsync. Once a commit
has succeeded, there is the task of removing ex-persistent bats (those that were
still persistent in the previous commit, but were made transient in this transaction).
Notice that such ex- (i.e. non-) persistent bats are not backed up by the BBPsync
protocol, so we cannot start deleting them until we know the commit has succeeded.

Another hairy issue is the delta statuses in BATs. These provide a fast way
to perform a transaction abort (HOT-abort, instead of COLD-abort, which is achieved
by the BBP recovery in a database restart). Hot-abort functionality has not been
important in MonetDB for now, so it is not well-tested.
The problem here is that if a commit fails in the physical part (BBPsync), we do
not have sufficient information to roll back the delta statuses. So a 'feature' of
the abort is that after a failed commit, in-memory we *will* commit the transaction.
Subsequent commits can retry to achieve a physical commit. The only way to abort in
such a situation is COLD-abort: quit the server and restart, so you get the
recovered disk images.
@{
@c
/* in the commit prelude, the delta status in the memory image of all bats is committed */
static int
prelude(int cnt, bat *subcommit)
{
	int i = 0;

	/* slot 0 is never used, so start at 1 */
	while (++i < cnt) {
		bat bid = subcommit ? subcommit[i] : i;

		if (BBP_status(bid) & BBPPERSISTENT) {
			BAT *b = BBP_cache(bid);

			if (b == NULL && (BBP_status(bid) & BBPSWAPPED)) {
				b = BBPquickdesc(bid, TRUE);
				if (b == NULL)
					return -1;
			}
			if (b) {
				BATcommit(b);
			}
		}
	}
	return 0;
}

/* in the commit epilogue, the BBP-status of the bats is changed to reflect their presence in the succeeded checkpoint.
 * Also bats from the previous checkpoint that were deleted now are physically destroyed.
 */
static int
epilogue(int cnt, bat *subcommit)
{
	int i = 0;

	while (++i < cnt) {
		bat bid = subcommit ? subcommit[i] : i;

		if (BBP_status(bid) & BBPPERSISTENT) {
			BBP_status_on(bid, BBPEXISTING, subcommit ? "TMsubcommit" : "TMcommit");
		} else if ((BBP_status(bid) & (BBPDELETED | BBPTMP)) && BBP_refs(bid) <= 0 && BBP_lrefs(bid) <= 0) {
			BAT *b = BBPquickdesc(bid, TRUE);

			/* the unloaded ones are deleted without loading deleted disk images */
			if (b) {	/* guard: BBPquickdesc can return NULL (see prelude) */
				BATdelete(b);
				if (BBP_cache(bid)) {
					/* those that quickdesc decides to load => free memory */
					BATfree(b);
				}
			}
			BBPclear(bid);	/* clear with locking */
		}
		BBP_status_off(bid, BBPDELETED | BBPSWAPPED | BBPNEW, subcommit ? "TMsubcommit" : "TMcommit");
	}
	return 0;
}

@- TMcommit
Global commit without any multi-threaded access assumptions, thus taking all BBP locks.
It creates a new database checkpoint.
@c
int
TMcommit(void)
{
	int ret = -1;

	/* commit with the BBP globally locked */
	BBPlock("TMcommit");
	if (prelude(BBPsize, NULL) == 0 && BBPsync(BBPsize, NULL) == 0) {
		ret = epilogue(BBPsize, NULL);
	}
	BBPunlock("TMcommit");
	return ret;
}

@- TMsubcommit
Create a new checkpoint that is equal to the previous, with the exception that
for the passed list of batnames, the current state will be reflected in the new checkpoint.
On the bats in this list we assume exclusive access during the operation.
This operation is useful for e.g. adding a new XQuery document or SQL
table to the committed state (after bulk-load). Or for dropping a table or doc,
without forcing the total database to be clean, which may require a lot of I/O.
We expect the globally locked phase (BBPsync) to take little time (<100ms)
as only the BBP.dir is written out; and for the existing bats that were
modified, only some heap moves are done (moved from BAKDIR to SUBDIR).
The atomic commit for sub-commit is the rename of SUBDIR to DELDIR.
As it does not take the BBP-locks (thanks to the assumption that access
is exclusive), the concurrency impact of subcommit on ongoing concurrent
query and update facilities is also much lighter than that of a real global TMcommit.
@c
int
TMsubcommit(BAT *b)
{
	int xx, ret = -1, cnt = 1;	/* BBP artifact: slot 0 in the array will be ignored */
	bat *subcommit = (bat *) alloca((BATcount(b) + 1) * sizeof(bat));
	BUN p, q;

	/* collect the list and save the new bats outside any locking */
	BATloopFast(b, p, q, xx) {
		bat bid = BBPindex((str) BUNtail(b, p));

		if (bid < 0)
			bid = -bid;
		if (bid)
			subcommit[cnt++] = bid;
	}
	/* sort the list on BAT id */
	GDKqsort(subcommit + 1, NULL, cnt - 1, sizeof(bat), TYPE_bat, 0);

	if (prelude(cnt, subcommit) == 0) {	/* save the new bats outside the lock */
		/* lock just prevents BBPtrims, and other global (sub-)commits */
		gdk_set_lock(GDKtrimLock, "TMsubcommit");
		if (BBPsync(cnt, subcommit) == 0) {	/* write BBP.dir (++) */
			ret = epilogue(cnt, subcommit);
		}
		gdk_unset_lock(GDKtrimLock, "TMsubcommit");
	}
	return ret;
}

@- TMabort
Transaction abort is cheap. We use the delta statuses
to go back to the previous version of each BAT. This also
applies to BATs that are currently swapped out.
Persistent BATs that were made transient in this transaction become
persistent again.
@c
int
TMabort(void)
{
	int i;

	BBPlock("TMabort");
	/* first pass: BATs created in this transaction become transient again */
	for (i = 1; i < BBPsize; i++) {
		if (BBP_status(i) & BBPNEW) {
			BAT *b = BBPquickdesc(i, FALSE);

			if (b) {
				if (b->batPersistence == PERSISTENT)
					BBPdecref(i, TRUE);
				b->batPersistence = TRANSIENT;
				b->batDirtydesc = 1;
			}
		}
	}
	/* second pass: undo changes and revive BATs deleted in this transaction */
	for (i = 1; i < BBPsize; i++) {
		if (BBP_status(i) & (BBPPERSISTENT | BBPDELETED | BBPSWAPPED)) {
			BAT *b = BBPquickdesc(i, TRUE);

			if (b == NULL)
				continue;

			BBPfix(i);
			if (BATdirty(b) || DELTAdirty(b)) {
				/* BUN move-backs need a real BAT! */
				/* Stefan:
				 * Actually, in case DELTAdirty(b), i.e., a
				 * BAT with differences that is
				 * saved/swapped-out but not yet committed,
				 * we (AFAIK) don't have to load the BAT and
				 * apply the undo, but rather could simply
				 * discard the delta and revive the backup;
				 * however, I don't know how to do this
				 * (yet), hence we stick with this solution
				 * for the time being --- it should be
				 * correct though it might not be the most
				 * efficient way...
				 */
				b = BBPdescriptor(i);
				BATundo(b);
			}
			if (BBP_status(i) & BBPDELETED) {
				BBP_status_on(i, BBPEXISTING, "TMabort");
				if (b->batPersistence != PERSISTENT)
					BBPincref(i, TRUE);
				b->batPersistence = PERSISTENT;
				b->batDirtydesc = 1;
			}
			BBPunfix(i);
		}
		BBP_status_off(i, BBPDELETED | BBPSWAPPED | BBPNEW, "TMabort");
	}
	BBPunlock("TMabort");
	return 0;
}
@}