⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 nbtpage.c

📁 postgresql8.3.4源码,开源数据库
💻 C
📖 第 1 页 / 共 3 页
字号:
	targetkey = CopyIndexTuple((IndexTuple) PageGetItem(page, itemid));	/*	 * To avoid deadlocks, we'd better drop the target page lock before going	 * further.	 */	_bt_relbuf(rel, buf);	/*	 * We need an approximate pointer to the page's parent page.  We use the	 * standard search mechanism to search for the page's high key; this will	 * give us a link to either the current parent or someplace to its left	 * (if there are multiple equal high keys).  In recursion cases, the	 * caller already generated a search stack and we can just re-use that	 * work.	 */	if (stack == NULL)	{		if (!InRecovery)		{			/* we need an insertion scan key to do our search, so build one */			itup_scankey = _bt_mkscankey(rel, targetkey);			/* find the leftmost leaf page containing this key */			stack = _bt_search(rel, rel->rd_rel->relnatts, itup_scankey, false,							   &lbuf, BT_READ);			/* don't need a pin on that either */			_bt_relbuf(rel, lbuf);			/*			 * If we are trying to delete an interior page, _bt_search did			 * more than we needed.  Locate the stack item pointing to our			 * parent level.			 */			ilevel = 0;			for (;;)			{				if (stack == NULL)					elog(ERROR, "not enough stack items");				if (ilevel == targetlevel)					break;				stack = stack->bts_parent;				ilevel++;			}		}		else		{			/*			 * During WAL recovery, we can't use _bt_search (for one reason,			 * it might invoke user-defined comparison functions that expect			 * facilities not available in recovery mode).	Instead, just set			 * up a dummy stack pointing to the left end of the parent tree			 * level, from which _bt_getstackbuf will walk right to the parent			 * page.  Painful, but we don't care too much about performance in			 * this scenario.			 */			pbuf = _bt_get_endpoint(rel, targetlevel + 1, false);			stack = (BTStack) palloc(sizeof(BTStackData));			stack->bts_blkno = BufferGetBlockNumber(pbuf);			stack->bts_offset = InvalidOffsetNumber;			/* bts_btentry will be initialized below */			stack->bts_parent = NULL;			_bt_relbuf(rel, pbuf);		}	}	/*	 * We cannot delete a page that is the rightmost child of its immediate	 * parent, unless it is the only child --- in which case the parent has to	 * be deleted too, and the same condition applies recursively to it. We	 * have to check this condition all the way up before trying to delete. We	 * don't need to re-test when deleting a non-leaf page, though.	 */	if (targetlevel == 0 &&		!_bt_parent_deletion_safe(rel, target, stack))		return 0;	/*	 * We have to lock the pages we need to modify in the standard order:	 * moving right, then up.  Else we will deadlock against other writers.	 *	 * So, we need to find and write-lock the current left sibling of the	 * target page.  The sibling that was current a moment ago could have	 * split, so we may have to move right.  This search could fail if either	 * the sibling or the target page was deleted by someone else meanwhile;	 * if so, give up.	(Right now, that should never happen, since page	 * deletion is only done in VACUUM and there shouldn't be multiple VACUUMs	 * concurrently on the same table.)	 */	if (leftsib != P_NONE)	{		lbuf = _bt_getbuf(rel, leftsib, BT_WRITE);		page = BufferGetPage(lbuf);		opaque = (BTPageOpaque) PageGetSpecialPointer(page);		while (P_ISDELETED(opaque) || opaque->btpo_next != target)		{			/* step right one page */			leftsib = opaque->btpo_next;			_bt_relbuf(rel, lbuf);			if (leftsib == P_NONE)			{				elog(LOG, "no left sibling (concurrent deletion?) in \"%s\"",					 RelationGetRelationName(rel));				return 0;			}			lbuf = _bt_getbuf(rel, leftsib, BT_WRITE);			page = BufferGetPage(lbuf);			opaque = (BTPageOpaque) PageGetSpecialPointer(page);		}	}	else		lbuf = InvalidBuffer;	/*	 * Next write-lock the target page itself.	It should be okay to take just	 * a write lock not a superexclusive lock, since no scans would stop on an	 * empty page.	 */	buf = _bt_getbuf(rel, target, BT_WRITE);	page = BufferGetPage(buf);	opaque = (BTPageOpaque) PageGetSpecialPointer(page);	/*	 * Check page is still empty etc, else abandon deletion.  The empty check	 * is necessary since someone else might have inserted into it while we	 * didn't have it locked; the others are just for paranoia's sake.	 */	if (P_RIGHTMOST(opaque) || P_ISROOT(opaque) || P_ISDELETED(opaque) ||		P_FIRSTDATAKEY(opaque) <= PageGetMaxOffsetNumber(page))	{		_bt_relbuf(rel, buf);		if (BufferIsValid(lbuf))			_bt_relbuf(rel, lbuf);		return 0;	}	if (opaque->btpo_prev != leftsib)		elog(ERROR, "left link changed unexpectedly in block %u of index \"%s\"",			 target, RelationGetRelationName(rel));	/*	 * And next write-lock the (current) right sibling.	 */	rightsib = opaque->btpo_next;	rbuf = _bt_getbuf(rel, rightsib, BT_WRITE);	/*	 * Next find and write-lock the current parent of the target page. This is	 * essentially the same as the corresponding step of splitting.	 */	ItemPointerSet(&(stack->bts_btentry.t_tid), target, P_HIKEY);	pbuf = _bt_getstackbuf(rel, stack, BT_WRITE);	if (pbuf == InvalidBuffer)		elog(ERROR, "failed to re-find parent key in index \"%s\" for deletion target page %u",			 RelationGetRelationName(rel), target);	parent = stack->bts_blkno;	poffset = stack->bts_offset;	/*	 * If the target is the rightmost child of its parent, then we can't	 * delete, unless it's also the only child --- in which case the parent	 * changes to half-dead status.  The "can't delete" case should have been	 * detected by _bt_parent_deletion_safe, so complain if we see it now.	 */	page = BufferGetPage(pbuf);	opaque = (BTPageOpaque) PageGetSpecialPointer(page);	maxoff = PageGetMaxOffsetNumber(page);	parent_half_dead = false;	parent_one_child = false;	if (poffset >= maxoff)	{		if (poffset == P_FIRSTDATAKEY(opaque))			parent_half_dead = true;		else			elog(ERROR, "failed to delete rightmost child %u of block %u in index \"%s\"",				 target, parent, RelationGetRelationName(rel));	}	else	{		/* Will there be exactly one child left in this parent? */		if (OffsetNumberNext(P_FIRSTDATAKEY(opaque)) == maxoff)			parent_one_child = true;	}	/*	 * If we are deleting the next-to-last page on the target's level, then	 * the rightsib is a candidate to become the new fast root. (In theory, it	 * might be possible to push the fast root even further down, but the odds	 * of doing so are slim, and the locking considerations daunting.)	 *	 * We don't support handling this in the case where the parent is becoming	 * half-dead, even though it theoretically could occur.	 *	 * We can safely acquire a lock on the metapage here --- see comments for	 * _bt_newroot().	 */	if (leftsib == P_NONE && !parent_half_dead)	{		page = BufferGetPage(rbuf);		opaque = (BTPageOpaque) PageGetSpecialPointer(page);		Assert(opaque->btpo.level == targetlevel);		if (P_RIGHTMOST(opaque))		{			/* rightsib will be the only one left on the level */			metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_WRITE);			metapg = BufferGetPage(metabuf);			metad = BTPageGetMeta(metapg);			/*			 * The expected case here is btm_fastlevel == targetlevel+1; if			 * the fastlevel is <= targetlevel, something is wrong, and we			 * choose to overwrite it to fix it.			 */			if (metad->btm_fastlevel > targetlevel + 1)			{				/* no update wanted */				_bt_relbuf(rel, metabuf);				metabuf = InvalidBuffer;			}		}	}	/*	 * Here we begin doing the deletion.	 */	/* No ereport(ERROR) until changes are logged */	START_CRIT_SECTION();	/*	 * Update parent.  The normal case is a tad tricky because we want to	 * delete the target's downlink and the *following* key.  Easiest way is	 * to copy the right sibling's downlink over the target downlink, and then	 * delete the following item.	 */	page = BufferGetPage(pbuf);	opaque = (BTPageOpaque) PageGetSpecialPointer(page);	if (parent_half_dead)	{		PageIndexTupleDelete(page, poffset);		opaque->btpo_flags |= BTP_HALF_DEAD;	}	else	{		OffsetNumber nextoffset;		itemid = PageGetItemId(page, poffset);		itup = (IndexTuple) PageGetItem(page, itemid);		Assert(ItemPointerGetBlockNumber(&(itup->t_tid)) == target);		ItemPointerSet(&(itup->t_tid), rightsib, P_HIKEY);		nextoffset = OffsetNumberNext(poffset);		/* This part is just for double-checking */		itemid = PageGetItemId(page, nextoffset);		itup = (IndexTuple) PageGetItem(page, itemid);		if (ItemPointerGetBlockNumber(&(itup->t_tid)) != rightsib)			elog(PANIC, "right sibling %u of block %u is not next child of %u in index \"%s\"",				 rightsib, target, BufferGetBlockNumber(pbuf),				 RelationGetRelationName(rel));		PageIndexTupleDelete(page, nextoffset);	}	/*	 * Update siblings' side-links.  Note the target page's side-links will	 * continue to point to the siblings.	 */	if (BufferIsValid(lbuf))	{		page = BufferGetPage(lbuf);		opaque = (BTPageOpaque) PageGetSpecialPointer(page);		Assert(opaque->btpo_next == target);		opaque->btpo_next = rightsib;	}	page = BufferGetPage(rbuf);	opaque = (BTPageOpaque) PageGetSpecialPointer(page);	Assert(opaque->btpo_prev == target);	opaque->btpo_prev = leftsib;	rightsib_empty = (P_FIRSTDATAKEY(opaque) > PageGetMaxOffsetNumber(page));	/*	 * Mark the page itself deleted.  It can be recycled when all current	 * transactions are gone; or immediately if we're doing VACUUM FULL.	 */	page = BufferGetPage(buf);	opaque = (BTPageOpaque) PageGetSpecialPointer(page);	opaque->btpo_flags &= ~BTP_HALF_DEAD;	opaque->btpo_flags |= BTP_DELETED;	opaque->btpo.xact =		vacuum_full ? FrozenTransactionId : ReadNewTransactionId();	/* And update the metapage, if needed */	if (BufferIsValid(metabuf))	{		metad->btm_fastroot = rightsib;		metad->btm_fastlevel = targetlevel;		MarkBufferDirty(metabuf);	}	/* Must mark buffers dirty before XLogInsert */	MarkBufferDirty(pbuf);	MarkBufferDirty(rbuf);	MarkBufferDirty(buf);	if (BufferIsValid(lbuf))		MarkBufferDirty(lbuf);	/* XLOG stuff */	if (!rel->rd_istemp)	{		xl_btree_delete_page xlrec;		xl_btree_metadata xlmeta;		uint8		xlinfo;		XLogRecPtr	recptr;		XLogRecData rdata[5];		XLogRecData *nextrdata;		xlrec.target.node = rel->rd_node;		ItemPointerSet(&(xlrec.target.tid), parent, poffset);		xlrec.deadblk = target;		xlrec.leftblk = leftsib;		xlrec.rightblk = rightsib;		rdata[0].data = (char *) &xlrec;		rdata[0].len = SizeOfBtreeDeletePage;		rdata[0].buffer = InvalidBuffer;		rdata[0].next = nextrdata = &(rdata[1]);		if (BufferIsValid(metabuf))		{			xlmeta.root = metad->btm_root;			xlmeta.level = metad->btm_level;			xlmeta.fastroot = metad->btm_fastroot;			xlmeta.fastlevel = metad->btm_fastlevel;			nextrdata->data = (char *) &xlmeta;			nextrdata->len = sizeof(xl_btree_metadata);			nextrdata->buffer = InvalidBuffer;			nextrdata->next = nextrdata + 1;			nextrdata++;			xlinfo = XLOG_BTREE_DELETE_PAGE_META;		}		else if (parent_half_dead)			xlinfo = XLOG_BTREE_DELETE_PAGE_HALF;		else			xlinfo = XLOG_BTREE_DELETE_PAGE;		nextrdata->data = NULL;		nextrdata->len = 0;		nextrdata->next = nextrdata + 1;		nextrdata->buffer = pbuf;		nextrdata->buffer_std = true;		nextrdata++;		nextrdata->data = NULL;		nextrdata->len = 0;		nextrdata->buffer = rbuf;		nextrdata->buffer_std = true;		nextrdata->next = NULL;		if (BufferIsValid(lbuf))		{			nextrdata->next = nextrdata + 1;			nextrdata++;			nextrdata->data = NULL;			nextrdata->len = 0;			nextrdata->buffer = lbuf;			nextrdata->buffer_std = true;			nextrdata->next = NULL;		}		recptr = XLogInsert(RM_BTREE_ID, xlinfo, rdata);		if (BufferIsValid(metabuf))		{			PageSetLSN(metapg, recptr);			PageSetTLI(metapg, ThisTimeLineID);		}		page = BufferGetPage(pbuf);		PageSetLSN(page, recptr);		PageSetTLI(page, ThisTimeLineID);		page = BufferGetPage(rbuf);		PageSetLSN(page, recptr);		PageSetTLI(page, ThisTimeLineID);		page = BufferGetPage(buf);		PageSetLSN(page, recptr);		PageSetTLI(page, ThisTimeLineID);		if (BufferIsValid(lbuf))		{			page = BufferGetPage(lbuf);			PageSetLSN(page, recptr);			PageSetTLI(page, ThisTimeLineID);		}	}	END_CRIT_SECTION();	/* release metapage; send out relcache inval if metapage changed */	if (BufferIsValid(metabuf))	{		CacheInvalidateRelcache(rel);		_bt_relbuf(rel, metabuf);	}	/* can always release leftsib immediately */	if (BufferIsValid(lbuf))		_bt_relbuf(rel, lbuf);	/*	 * If parent became half dead, recurse to delete it. Otherwise, if right	 * sibling is empty and is now the last child of the parent, recurse to	 * try to delete it.  (These cases cannot apply at the same time, though	 * the second case might itself recurse to the first.)	 *	 * When recursing to parent, we hold the lock on the target page until	 * done.  This delays any insertions into the keyspace that was just	 * effectively reassigned to the parent's right sibling.  If we allowed	 * that, and there were enough such insertions before we finish deleting	 * the parent, page splits within that keyspace could lead to inserting	 * out-of-order keys into the grandparent level.  It is thought that that	 * wouldn't have any serious consequences, but it still seems like a	 * pretty bad idea.	 */	if (parent_half_dead)	{		/* recursive call will release pbuf */		_bt_relbuf(rel, rbuf);		result = _bt_pagedel(rel, pbuf, stack->bts_parent, vacuum_full) + 1;		_bt_relbuf(rel, buf);	}	else if (parent_one_child && rightsib_empty)	{		_bt_relbuf(rel, pbuf);		_bt_relbuf(rel, buf);		/* recursive call will release rbuf */		result = _bt_pagedel(rel, rbuf, stack, vacuum_full) + 1;	}	else	{		_bt_relbuf(rel, pbuf);		_bt_relbuf(rel, buf);		_bt_relbuf(rel, rbuf);		result = 1;	}	return result;}

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -