📄 vcsk.c
字号:
/* * Copyright (C) 1998, 1999, 2001, Jonathan S. Shapiro. * * This file is part of the EROS Operating System. * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version 2, * or (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. *//* VCSK -- Virtual copy space keeper/Zero space keeper (they are actually the same! When a fault occurs on a frozen VCS, the fault is simply punted to the domain keeper. When a fault occurs on a non-frozen VCS, the action taken depends on the fault type: Read Fault -- space is expanded appropriately using capage capabilities into the primordial zero space. Write fault -- all immutable capages on the path from root to offset are replaced with mutable capages until an immutable page is found. A new page is acquired from a space bank and the content of the old page is copied into the new page. The capage tree for a VCS contains only the following: + sensory capages keys to the background space. + mutable keys to capages that have already been copied. + RO page keys. + RW page keys to pages that have already been copied. + A single red space capage, which is the root of the VCS. You might think that this ought to be built using background windows -- I certainly did. Unfortunately, the background window approach has an undesirable consequence -- the translation of pages that have NOT been copied requires a traversal that is a function of the tree depth. Norm points out that it is necessary to have a distinct VCSK for each active kept segment, to avoid the situation in which a single VCSK might be come inoperative due to blocking on a dismounted page. VCSK doesn't quite start up as you might expect. Generally, it is the product of a factory, and returns a segment key as it's "yield" The special case is that the zero segment keeper (which is the very first, primordial keeper) does not do this, but initializes directly into "frozen" mode. It recognizes this case by the fact that a pre-known constituent slot holds something other than void. */#include <eros/target.h>#include <eros/Invoke.h>#include <eros/NumberKey.h>#include <eros/NodeKey.h>#include <eros/PageKey.h>#include <eros/ProcessKey.h>#include <eros/StdKeyType.h>#include <eros/ReturnerKey.h>#include <eros/Key.h>#include <domain/VcskKey.h>#include <domain/domdbg.h>#include <domain/SpaceBankKey.h>#include <domain/ProtoSpace.h>#include <domain/Runtime.h>#define dbg_init 0x1#define dbg_invalid 0x2#define dbg_access 0x4#define dbg_op 0x8#define dbg_destroy 0x10#define dbg_returns 0x20/* Following should be an OR of some of the above */#define dbg_flags ( 0u )#define DEBUG(x) if (dbg_##x & dbg_flags)#define KR_DOMCRE 3#define KR_INITSEG 5 /* used in primordia init */#define KR_SCRATCH 6#define KR_SCRATCH2 7#define KR_ZINDEX 9 /* index of primordial zero space */ #define KR_OSTREAM 10#define KR_RETURNER 11#define KR_RK0 28#define KR_RK1 29#define KR_RK2 30#define KR_RESUME 31 /* often a fault key */#define KR_NEWOBJ KR_RK0#define KR_SEGMENT KR_RK2#define KR_L1_NODE 12 /* last L1 node we COW'd a page into */#define KR_PG_CACHE 12 /* really 13, 14, 15 */#define KC_OSTREAM 0#define KC_RETURNER 1#define KC_ZINDEX 2#define KC_FROZEN_SEG 3#define KC_PROTOSPC 4/* POLICY NOTE: The original design was intended to grow by one page when zero-extending the segment. Newer operating systems frequently grow by more than this. The following tunable may be set to anything in the range 1..EROS_NODE_SIZE to control how many pages the segment grows by. Note that this could (and should) be made into a user-controllable variable -- I just haven't bothered yet. */#define ZERO_EXTEND_BY 2/* If this was a direct invocation of our start key, RK0..RESUME hold whatever the user passed. If this was a pass-through invocation via a red space key, these hold what the user passed except that KR_RK2 may have been replaced by a capage key to the red space capage. If this was a kernel-generated fault invocation, KR_RK0 and KR_RK1 hold zero data keys, KR_RK2 holds whatever the upcall pass-through convention specified, and KR_RESUME holds a fault key. *//* IMPORTANT optimization: VCSK serves to implement both demand-copy and demand-zero segments. Of the cycles spent invoking capabilities, it proves that about 45% of them are spent in page_clone, and another 45% in range key calls (done by the space bank). The only call to page_clone is here. It is unavoidable when we are actually doing a virtual copy, but very much avoidable if we are doing demand-zero extension on an empty or short segment -- the page we get from the space bank is already zeroed. We therefore remember in the VCSK state the offset of the *end* of the last non-zero page. Anything past this is known to be zero. We take advantage of this to know when the page_clone() operation can be skipped. The variable is /first_zero_offset/. When a VCS is frozen, we stash this number in a number key in the red segment node, and reload it from there when the factory creates a new child node. At the moment I have not implemented this, because the VCSK code is also trying to be useful in providing demand extension of existing segments. There are a couple of initialization cases to worry about -- all straightforward -- but I haven't dealt with those yet. None of these are issues in the performance path. */#define true 1#define false 0typedef struct { bool_t was_access; uint64_t last_offset; /* last offset we modified */ uint64_t first_zero_offset; /* first known-zero offset */ int frozen; uint32_t npage;} state;void Sepuku();void DestroySegment(state *);typedef int bool;#define true 1#define false 0uint32_tBiasedLSS(uint64_t offset){ /* Shouldn't this be using fcs()? */ uint32_t bits = 0; uint32_t w0 = (uint32_t) offset; static unsigned hexbits[16] = {0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4 }; /* Run a decision tree: */ if (offset >= 0x100000000ull) { bits += 32; w0 = (offset >> 32); } if (w0 >= 0x10000ull) { bits += 16; w0 >>= 16; } if (w0 >= 0x100u) { bits += 8; w0 >>= 8; } if (w0 >= 0x10u) { bits += 4; w0 >>= 4; } /* Table lookup for the last part: */ bits += hexbits[w0]; if (bits < EROS_PAGE_ADDR_BITS) return EROS_PAGE_BLSS; bits -= EROS_PAGE_ADDR_BITS; bits += (EROS_NODE_LGSIZE - 1); bits /= EROS_NODE_LGSIZE; bits += EROS_PAGE_BLSS; return bits;} uint64_tBlssMask(uint32_t blss){ if (blss < EROS_PAGE_BLSS) return 0ull; {#if (EROS_PAGE_ADDR_BITS == (EROS_PAGE_BLSS * EROS_NODE_LGSIZE)) uint32_t bits_to_shift = blss * EROS_NODE_LGSIZE;#else uint32_t bits_to_shift = (blss - EROS_PAGE_BLSS) * EROS_NODE_LGSIZE + EROS_PAGE_ADDR_BITS; #endif if (bits_to_shift >= UINT64_BITS) return (uint64_t) -1; /* all 1's */ { uint64_t mask = (1ull << bits_to_shift); mask -= 1ull; return mask; } }}uint32_tBlssSlotNdx(uint64_t offset, uint32_t blss){ if (blss <= EROS_PAGE_BLSS) return 0; {#if (EROS_PAGE_ADDR_BITS == (EROS_PAGE_BLSS * EROS_NODE_LGSIZE)) uint32_t bits_to_shift = blss * EROS_NODE_LGSIZE;#else uint32_t bits_to_shift = (blss - EROS_PAGE_BLSS - 1) * EROS_NODE_LGSIZE + EROS_PAGE_ADDR_BITS; #endif if (bits_to_shift >= UINT64_BITS) return 0; return offset >> bits_to_shift; }}uint32_tAllocPage(state *pState){#if 1 if (pState->npage == 0) { if (spcbank_buy_data_pages(KR_BANK, 3, KR_PG_CACHE + 1, KR_PG_CACHE + 2, KR_PG_CACHE + 3) == RC_OK) { pState->npage = 3; } else if (spcbank_buy_data_pages(KR_BANK, 2, KR_PG_CACHE + 1, KR_PG_CACHE + 2, KR_VOID) == RC_OK) { pState->npage = 2; } else if (spcbank_buy_data_pages(KR_BANK, 1, KR_PG_CACHE + 1, KR_VOID, KR_VOID) == RC_OK) { pState->npage = 1; } else return KR_VOID; } { uint32_t kr = KR_PG_CACHE + pState->npage; pState->npage--; return kr; }#else if (spcbank_buy_data_pages(KR_BANK, 1, KR_NEWOBJ, KR_VOID, KR_VOID) == RC_OK) return KR_NEWOBJ; else return KR_VOID;#endif}uint32_tHandleSegmentFault(Message *pMsg, state *pState){ uint32_t slot = 0; uint32_t kt = RC_Void; uint32_t offsetBlss; uint32_t segBlss = EROS_PAGE_BLSS + 1; /* until otherwise proven */ uint64_t offset = ((uint64_t) pMsg->rcv_w3) << 32 | (uint64_t) pMsg->rcv_w2; offsetBlss = BiasedLSS(offset); switch (pMsg->rcv_w1) { case FC_InvalidAddr: { /* Subseg is too small */ /* OPTIMIZATION NOTE: The effect of FC_InvalidAddr is to zero-extend the segment. It therefore cannot alter the value of /first_zero_offset/. */ uint32_t result; uint16_t subsegBlss; pState->was_access = false; DEBUG(invalid) kdprintf(KR_OSTREAM, "FC_SegInvalidAddr at 0x%08x %08x\n", (uint32_t) (offset>>32), (uint32_t) offset); /* fetch out the subseg key */ node_copy(KR_SEGMENT, slot, KR_SCRATCH); /* find out it's BLSS: */ result = node_get_key_data(KR_SCRATCH, &subsegBlss); if (result == RC_UnknownRequest) subsegBlss = EROS_PAGE_BLSS; /* it must be a page key */ subsegBlss &= SEGMODE_BLSS_MASK; segBlss = subsegBlss + 1; DEBUG(invalid) kdprintf(KR_OSTREAM, "FC_SegInvalidAddr: segBlss %d offsetBlss %d\n", segBlss, offsetBlss); if (subsegBlss < offsetBlss) { /* Except at the top, we make no attempt to short-circuit the space tree. */ while (subsegBlss < offsetBlss) { DEBUG(invalid) kdprintf(KR_OSTREAM, " Growing: subsegblss %d offsetblss %d\n", subsegBlss, offsetBlss); /* Buy a new node to expand with: */ if (spcbank_buy_nodes(KR_BANK, 1, KR_NEWOBJ, KR_VOID, KR_VOID) != RC_OK) return RC_NoMoreNodes; /* Make that node have BLSS == subsegBlss+1: */ node_make_node_key(KR_NEWOBJ, subsegBlss+1, KR_NEWOBJ); /* Insert the old subseg into slot 0: */ node_swap(KR_NEWOBJ, 0, KR_SCRATCH, KR_VOID); /* Populate slots 1 and higher with suitable primordial zero subsegments. */ { uint32_t zindex_slot = subsegBlss; int i; node_copy(KR_ZINDEX, zindex_slot, KR_SCRATCH); for (i = 1; i < EROS_NODE_SIZE; i++) node_swap(KR_NEWOBJ, i, KR_SCRATCH, KR_VOID); } copy_key_reg(KR_RETURNER, KR_NEWOBJ, KR_SCRATCH); /* Finally, insert the new subseg into the original red segment */ node_swap(KR_SEGMENT, 0, KR_SCRATCH, KR_VOID); subsegBlss++; } if (subsegBlss >= segBlss) { /* Segment has grown. Rewrite the format key to reflect the new segment size. */ nk_value nkv; DEBUG(invalid) kdprintf(KR_OSTREAM, " Red seg must grow\n"); segBlss = subsegBlss + 1;#ifdef KT_Wrapper nkv.value[0] = WRAPPER_SEND_NODE | WRAPPER_KEEPER; WRAPPER_SET_BLSS(nkv, segBlss); nkv.value[1] = 0; nkv.value[2] = 0; node_write_number(KR_SEGMENT, WrapperFormat, &nkv);#else nkv.value[0] = 0; REDSEG_SET_INITIAL_SLOTS(nkv, 1); REDSEG_SET_SENDNODE(nkv, REDSEG_SEND_NODE); REDSEG_SET_KPR_SLOT(nkv, RedSegKeeper); REDSEG_SET_BG_SLOT(nkv, RedSegFormat); /* none */ REDSEG_SET_BLSS(nkv, segBlss); nkv.value[1] = 0; nkv.value[2] = 0; node_write_number(KR_SEGMENT, RedSegFormat, &nkv);#endif } DEBUG(returns) kdprintf(KR_OSTREAM, "Returning from FC_InvalidAddr after growing" " managed segment to blss=%d\n", subsegBlss); return RC_OK; } /* Segment is big enough, but some internal portion was unpopulated. Note (very important) that we have not yet clobbered KR_SEGMENT. */ DEBUG(invalid) kdprintf(KR_OSTREAM, "Invalid internal address -- falling through to COW logic for znode\n"); DEBUG(invalid) kprintf(KR_OSTREAM, " traverse slot %d, blss %d, offset 0x%08x %08x\n", slot, segBlss, (uint32_t) (offset >> 32), (uint32_t) offset); /* Traverse downward past all nodes. No need to check writability. Had that been the problem we would have gotten an access violation instead. This logic will work fine as long as the whole segment tree is nodes and pages, but will fail miserably is a subsegment is plugged in here somewhere. */ while (subsegBlss > EROS_PAGE_BLSS) { uint32_t w2; DEBUG(invalid) kdprintf(KR_OSTREAM, " Walking down: subsegBlss %d\n", subsegBlss); result = key_kt(KR_SCRATCH, &kt, &w2); if (result != RC_OK || kt != AKT_Node) { DEBUG(invalid) kdprintf(KR_OSTREAM, " subsegBlss %d invalid!\n", subsegBlss); break; } copy_key_reg(KR_RETURNER, KR_SCRATCH, KR_SEGMENT); slot = BlssSlotNdx(offset, subsegBlss); offset &= BlssMask(subsegBlss-1); DEBUG(invalid) kprintf(KR_OSTREAM, " traverse slot %d, blss %d, offset 0x%08x %08x\n", slot, subsegBlss, (uint32_t) (offset>>32), (uint32_t) offset); subsegBlss--; node_copy(KR_SEGMENT, slot, KR_SCRATCH); } /* Key in KR_SCRATCH is the read-only or invalid key. Key in KR_SEGMENT is it's parent. */ DEBUG(invalid) kdprintf(KR_OSTREAM, "Found rc=0x%x kt=0x%x at subsegBlss=%d\n", result, kt, subsegBlss); /* Replace the offending subsegment with a primordial zero segment of suitable size: */ { uint32_t zindex_slot = subsegBlss; node_copy(KR_ZINDEX, zindex_slot, KR_SCRATCH); node_swap(KR_SEGMENT, slot, KR_SCRATCH, KR_VOID); } DEBUG(returns) kdprintf(KR_OSTREAM, "Returning from FC_InvalidAddr after populating"
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -