plan
来自「基于组件方式开发操作系统的OSKIT源代码」· 代码 · 共 125 行
TXT
125 行
Levels of optimization: 0. none - simple, fast, should always win 1. hard-wire some jumps - requires SMC (and hence a bit of cache flushing), but should only add a constant overhead to new filter insertion. Cost is insensitive to number of protocols or filters. 2. adjacency: blt code into a contigious vector: cost grows with number of filters 3. does inter-atom scheduling: cost grows with number of filters. fastest. 4. naive opt: regen from whole trie. is as efficient as 2 and uses less space than 2 or 3, but has high overhead.Base code generation:---------------------The major overhead with filter insertion/deletion is updating jumpswhose targets have changed as a result of adding additional code. Thiscost can, in bad cases, be proportional to the number of filters. Tocontrol this, we dynamically compute the nodes to jump too on successand failure of a given boolean (i.e., computation is done on everydemux). This trades demux overhead for more efficient triemodification. A sketch of the algorithm follows.Each atom is associated with a chunk of code. This code computes theexpression the atom represents. At a high-level this code is augmentedin two ways: on failure it jumps to a pc given by the ``failure pcregister'' and, on success, jumps to a pc it has computed. In somecases, it also contains code to both save the current pc to jump too onfailure and load it with a pointer a new failure pc. At a low-level,this later code is different in four cases: (1) if the current node hasa pid, (2) if the current node is part of an or list, and (3) if thecurrent node is a shift. We discuss each in turn.Current node has a pid:-----------------------If the atom contains a pid and has subsequent children we load the``failure return register'' with a pointer to code to return this pid.The original pc does not need to be saved since, from this pointforward, we will always succeed.Current node is a shift:------------------------The ``failure return register'' is saved, and a pointer to restore themessage pointer is loaded. This later code is necessary since shift'sdestructively alter the msg pointer (which may be needed by otherfilters).Current node is an or:----------------------The first or will save failure reg. Both it and all middle orsoverwrite it with the pc of the subsequent or node's code. The lastnode will reload the failure reg.Whenever a new or is added to the tail or head of the or list, theprevious tail/head must be demoted.Handling success:------------------The pc to jump to is the pc of the first child's code. This is loadedusing the kid pointer.Both code and data are unified into a single piece of memory. Thisallows pc computations do be done using the data pointers already inplace (the pc is computed by adding a constant offset to thesepointers). When code is regenerated it must be recopied into thisarea. To save hassle of resituating atoms we overallocate. The failure register values are saved on the stack in a pre-allocatedarray. Offsets are computed: (reg_area_offset + filter_depth * sizeof(void *)) (sp)Code forms:-----------The code for an eq looks something like: load success pointer [ compute boolean ] bool == true goto l # failed: jump out j failure_register l: j success_registerThe prologue code for the first or is the code to save failure_reg: store failure_reg, (reg_offset + filter_depth * sizeof(void *)) (sp)The prologue code to compute new failure reg loads the next code pointer using the next pointer already in place and adding the required offset: load failure_reg, (pc - offset(code) + offset(lh_first)) add failure_reg, offset(code)The epilogue code of the last or node: load failure_reg, (reg_offset + filter_depth * sizeof(void *)) (sp)If the current node has a pid: load failure_reg, l [ code for atom goes here ] l: return pid;Code gen issues:------------------This method requires incremental linking: + the epilogue label cannot be killed during linking. + register context must stay ok across links. + first time dpf is started up we generate prologue and epilogue code. must have a jump in this code to the first code fragment. Can special case a bunch of stuff with added complexity, but no real runtime overhead. Should probably think about treating the hash tables in a more sophisticated way.
⌨️ 快捷键说明
复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?