the condition which applies. The encoding is as it is in the x86 insn stream, except we add a 17th value <code class="computeroutput">CondAlways</code> to indicate an unconditional transfer.</p></li><li><p>Various 1-bit flags, indicating whether this insn pertains to an x86 CALL or RET instruction, whether a widening is signed or not, etc.</p></li></ul></div><p>UOpcodes (type <code class="computeroutput">Opcode</code>) are divided into two groups: those necessary merely to express the functionality of the x86 code, and extra uopcodes needed to express the instrumentation. The former group contains:</p><div class="itemizedlist"><ul type="disc"><li><p><code class="computeroutput">GET</code> and <code class="computeroutput">PUT</code>, which move values from the simulated CPU's integer registers (<code class="computeroutput">ArchReg</code>s) into <code class="computeroutput">TempReg</code>s, and back. <code class="computeroutput">GETF</code> and <code class="computeroutput">PUTF</code> do the corresponding thing for the simulated <code class="computeroutput">%EFLAGS</code>. There are no corresponding insns for the FPU register stack, since we don't explicitly simulate its registers.</p></li><li><p><code class="computeroutput">LOAD</code> and <code class="computeroutput">STORE</code>, which, in RISC-like fashion, are the only uinstrs able to interact with memory.</p></li><li><p><code class="computeroutput">MOV</code> and <code class="computeroutput">CMOV</code> allow unconditional and conditional moves of values between <code class="computeroutput">TempReg</code>s.</p></li><li><p>ALU operations. Again in RISC-like fashion, these only operate on <code class="computeroutput">TempReg</code>s (before reg-alloc) or <code class="computeroutput">RealReg</code>s (after reg-alloc). 
These are: <code class="computeroutput">ADD</code>, <code class="computeroutput">ADC</code>, <code class="computeroutput">AND</code>, <code class="computeroutput">OR</code>, <code class="computeroutput">XOR</code>, <code class="computeroutput">SUB</code>, <code class="computeroutput">SBB</code>, <code class="computeroutput">SHL</code>, <code class="computeroutput">SHR</code>, <code class="computeroutput">SAR</code>, <code class="computeroutput">ROL</code>, <code class="computeroutput">ROR</code>, <code class="computeroutput">RCL</code>, <code class="computeroutput">RCR</code>, <code class="computeroutput">NOT</code>, <code class="computeroutput">NEG</code>, <code class="computeroutput">INC</code>, <code class="computeroutput">DEC</code>, <code class="computeroutput">BSWAP</code>, <code class="computeroutput">CC2VAL</code> and <code class="computeroutput">WIDEN</code>. <code class="computeroutput">WIDEN</code> does signed or unsigned value widening. <code class="computeroutput">CC2VAL</code> is used to convert condition codes into a value, zero or one. The rest are obvious.</p><p>To allow for more efficient code generation, we slightly bend the restriction stated at the start of the previous paragraph: for <code class="computeroutput">ADD</code>, <code class="computeroutput">ADC</code>, <code class="computeroutput">XOR</code>, <code class="computeroutput">SUB</code> and <code class="computeroutput">SBB</code>, we allow the first (source) operand to also be an <code class="computeroutput">ArchReg</code>, that is, one of the simulated machine's registers. Also, many of these ALU ops allow the source operand to be a literal. See <code class="computeroutput">VG_(saneUInstr)</code> for the final word on the allowable forms of uinstrs.</p></li><li><p><code class="computeroutput">LEA1</code> and <code class="computeroutput">LEA2</code> are not strictly necessary, but facilitate better translations. 
They record the more elaborate x86 addressing modes directly, which allows those amodes to be emitted back into the final instruction stream more or less verbatim.</p></li><li><p><code class="computeroutput">CALLM</code> calls a machine-code helper, one of the routines whose address is stored at some <code class="computeroutput">VG_(baseBlock)</code> offset. <code class="computeroutput">PUSH</code> and <code class="computeroutput">POP</code> move values between a <code class="computeroutput">TempReg</code> and the real (Valgrind's) stack, and <code class="computeroutput">CLEAR</code> removes values from the stack. <code class="computeroutput">CALLM_S</code> and <code class="computeroutput">CALLM_E</code> delimit the boundaries of call setups and clearings, for the benefit of the instrumentation passes. Getting this right is critical, and so <code class="computeroutput">VG_(saneUCodeBlock)</code> makes various checks on the use of these uopcodes.</p><p>It is important to understand that these uopcodes have nothing to do with the x86 <code class="computeroutput">call</code>, <code class="computeroutput">ret</code>, <code class="computeroutput">push</code> or <code class="computeroutput">pop</code> instructions, and are not used to implement them. Those instructions are translated into combinations of <code class="computeroutput">GET</code>, <code class="computeroutput">PUT</code>, <code class="computeroutput">LOAD</code>, <code class="computeroutput">STORE</code>, <code class="computeroutput">ADD</code>, <code class="computeroutput">SUB</code>, and <code class="computeroutput">JMP</code>. What these uopcodes support is the calling of helper functions such as <code class="computeroutput">VG_(helper_imul_32_64)</code>, which perform operations that are too difficult or tedious to emit inline.</p></li><li><p><code class="computeroutput">FPU</code>, <code class="computeroutput">FPU_R</code> and <code class="computeroutput">FPU_W</code>. 
Valgrind doesn't attempt to simulate the internal state of the FPU at all. Consequently it only needs to be able to distinguish FPU ops which read and write memory from those that don't, and for those which do, it needs to know the effective address and data transfer size. This is made easier because the x86 FP instruction encoding is very regular, essentially consisting of 16 bits for a non-memory FPU insn, and 11 bits (if I recall correctly) plus an address mode for a memory FPU insn. So our <code class="computeroutput">FPU</code> uinstr carries the 16 bits in its <code class="computeroutput">val1</code> field. And <code class="computeroutput">FPU_R</code> and <code class="computeroutput">FPU_W</code> carry 11 bits in that field, together with the identity of a <code class="computeroutput">TempReg</code> or (later) <code class="computeroutput">RealReg</code> which contains the address.</p></li><li><p><code class="computeroutput">JIFZ</code> is unique, in that it allows a control-flow transfer which is not deemed to end a basic block. It causes a jump to a literal (original) address if the specified argument is zero.</p></li><li><p>Finally, <code class="computeroutput">INCEIP</code> advances the simulated <code class="computeroutput">%EIP</code> by the specified literal amount. This supports lazy <code class="computeroutput">%EIP</code> updating, as described below.</p></li></ul></div><p>Stages 1 and 2 of the 6-stage translation process mentioned above deal purely with these uopcodes, and no others. They are sufficient to express pretty much all the x86 32-bit protected-mode instruction set, at least everything understood by a pre-MMX original Pentium (P54C).</p><p>Stages 3, 4, 5 and 6 also deal with the following extra "instrumentation" uopcodes. They are used to express all the definedness-tracking and -checking machinery which Valgrind performs. In later sections we show how to create checking code for each of the uopcodes above. 
Note that these instrumentation uopcodes, although some appear complicated, have been carefully chosen so that efficient x86 code can be generated for them. GNU superopt v2.5 did a great job helping out here. The uopcodes are as follows:</p><div class="itemizedlist"><ul type="disc"><li><p><code class="computeroutput">GETV</code> and <code class="computeroutput">PUTV</code> are analogues of <code class="computeroutput">GET</code> and <code class="computeroutput">PUT</code> above. They are identical except that they move the V bits for the specified values back and forth to <code class="computeroutput">TempReg</code>s, rather than moving the values themselves.</p></li><li><p>Similarly, <code class="computeroutput">LOADV</code> and <code class="computeroutput">STOREV</code> read and write V bits from the synthesised shadow memory that Valgrind maintains. In fact they do more than that, since they also do address-validity checks, and emit complaints if the read/written addresses are unaddressable.</p></li><li><p><code class="computeroutput">TESTV</code>, whose parameters are a <code class="computeroutput">TempReg</code> and a size, tests the V bits in the <code class="computeroutput">TempReg</code>, at the specified operation size (0/1/2/4 byte) and emits an error if any of them indicate undefinedness. This is the only uopcode capable of doing such tests.</p></li><li><p><code class="computeroutput">SETV</code>, whose parameters are also a <code class="computeroutput">TempReg</code> and a size, makes the V bits in the <code class="computeroutput">TempReg</code> indicate definedness, at the specified operation size. This is usually used to generate the correct V bits for a literal value, which is of course fully defined.</p></li><li><p><code class="computeroutput">GETVF</code> and <code class="computeroutput">PUTVF</code> are analogues of <code class="computeroutput">GETF</code> and <code class="computeroutput">PUTF</code>. 
They move the single V bit used to model definedness of <code class="computeroutput">%EFLAGS</code> between its home in <code class="computeroutput">VG_(baseBlock)</code> and the specified <code class="computeroutput">TempReg</code>.</p></li><li><p><code class="computeroutput">TAG1</code> denotes one of a family of unary operations on <code class="computeroutput">TempReg</code>s containing V bits. Similarly, <code class="computeroutput">TAG2</code> denotes one in a family of binary operations on V bits.</p></li></ul></div><p>These 10 uopcodes are sufficient to express Valgrind's entire definedness-checking semantics. In fact most of the interesting magic is done by the <code class="computeroutput">TAG1</code> and <code class="computeroutput">TAG2</code> suboperations.</p><p>First, however, I need to explain about V-vector operation sizes. There are 4 sizes: 1, 2 and 4, which operate on groups of 8, 16 and 32 V bits at a time, supporting the usual 1, 2 and 4 byte x86 operations. However there is also the mysterious size 0, which really means a single V bit. Single V bits are used in various circumstances; in particular, the definedness of <code class="computeroutput">%EFLAGS</code> is modelled with a single V bit. Now might be a good time to also point out that for V bits, 1 means "undefined" and 0 means "defined". Similarly, for A bits, 1 means "invalid address" and 0 means "valid address". This seems counterintuitive (and so it is), but testing against zero on x86s saves instructions compared to testing against all 1s, because many ALU operations set the Z flag for free, so to speak.</p><p>With that in mind, the tag ops are:</p><div class="itemizedlist"><ul type="disc"><li><p><b>(UNARY) Pessimising casts: