flushed back into memory at the end of every basic block, so that the in-memory state is up-to-date between basic blocks. (This flushing is implied by the statement above that the real machine's allocatable registers are dead in between simulated blocks).</p></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="mc-tech-docs.startup"></a>1.2.2. Startup, shutdown, and system calls</h3></div></div></div><p>Getting into Valgrind (<code class="computeroutput">VG_(startup)</code>, called from <code class="filename">valgrind.so</code>'s initialisation section), really means copying the real CPU's state into <code class="computeroutput">VG_(baseBlock)</code>, and then installing our own stack pointer, etc, into the real CPU, and then starting up the JITter. Exiting Valgrind involves copying the simulated state back to the real state.</p><p>Unfortunately, there's a complication at startup time. The problem is that at the point where we need to take a snapshot of the real CPU's state, the offsets in <code class="computeroutput">VG_(baseBlock)</code> are not set up yet, because to do so would involve disrupting the real machine's state significantly. The way round this is to dump the real machine's state into a temporary, static block of memory, <code class="computeroutput">VG_(m_state_static)</code>. We can then set up the <code class="computeroutput">VG_(baseBlock)</code> offsets at our leisure, and copy into it from <code class="computeroutput">VG_(m_state_static)</code> at some convenient later time. 
This copying is done by <code class="computeroutput">VG_(copy_m_state_static_to_baseBlock)</code>.</p><p>On exit, the inverse transformation is (rather unnecessarily) used: stuff in <code class="computeroutput">VG_(baseBlock)</code> is copied to <code class="computeroutput">VG_(m_state_static)</code>, and the assembly stub then copies from <code class="computeroutput">VG_(m_state_static)</code> into the real machine registers.</p><p>Doing system calls on behalf of the client (<code class="filename">vg_syscall.S</code>) is something of a half-way house. We have to make the world look sufficiently like the one the client would normally see for the syscall to actually work properly, but we can't afford to lose control. So the trick is to copy all of the client's state, <span><strong class="command">except its program counter</strong></span>, into the real CPU, do the system call, and copy the state back out. Note that the client's state includes its stack pointer register, so one effect of this partial restoration is to cause the system call to be run on the client's stack, as it should be.</p><p>As ever there are complications. We have to save some of our own state somewhere when restoring the client's state into the CPU, so that we can keep going sensibly afterwards. In fact the only thing which is important is our own stack pointer, but for paranoia reasons I save and restore our own FPU state as well, even though that's probably pointless.</p><p>The complication on the above complication is that, for horrible reasons to do with signals, we may have to handle a second client system call whilst the client is blocked inside some other system call (unbelievable!). 
That means there are two sets of places to dump Valgrind's stack pointer and FPU state across the syscall, and we decide which to use by consulting <code class="computeroutput">VG_(syscall_depth)</code>, which is in turn maintained by <code class="computeroutput">VG_(wrap_syscall)</code>.</p></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="mc-tech-docs.ucode"></a>1.2.3. Introduction to UCode</h3></div></div></div><p>UCode lies at the heart of the x86-to-x86 JITter. The basic premise is that dealing with the x86 instruction set head-on is just too darn complicated, so we do the traditional compiler-writer's trick and translate it into a simpler, easier-to-deal-with form.</p><p>In normal operation, translation proceeds through six stages, coordinated by <code class="computeroutput">VG_(translate)</code>:</p><div class="orderedlist"><ol type="1"><li><p>Parsing of an x86 basic block into a sequence of UCode instructions (<code class="computeroutput">VG_(disBB)</code>).</p></li><li><p>UCode optimisation (<code class="computeroutput">vg_improve</code>), with the aim of caching simulated registers in real registers over multiple simulated instructions, and removing redundant simulated <code class="computeroutput">%EFLAGS</code> saving/restoring.</p></li><li><p>UCode instrumentation (<code class="computeroutput">vg_instrument</code>), which adds value and address checking code.</p></li><li><p>Post-instrumentation cleanup (<code class="computeroutput">vg_cleanup</code>), removing redundant value-check computations.</p></li><li><p>Register allocation (<code class="computeroutput">vg_do_register_allocation</code>), which, note, is done on UCode.</p></li><li><p>Emission of final instrumented x86 code (<code class="computeroutput">VG_(emit_code)</code>).</p></li></ol></div><p>Notice how steps 2, 3, 4 and 5 are simple UCode-to-UCode transformation passes, all on straight-line blocks of UCode (type <code class="computeroutput">UCodeBlock</code>). 
Steps 2 and 4 are optimisation passes and can be disabled for debugging purposes, with <code class="option">--optimise=no</code> and <code class="option">--cleanup=no</code> respectively.</p><p>Valgrind can also run in a no-instrumentation mode, given <code class="option">--instrument=no</code>. This is useful for debugging the JITter quickly without having to deal with the complexity of the instrumentation mechanism too. In this mode, steps 3 and 4 are omitted.</p><p>These flags combine, so that <code class="option">--instrument=no</code> together with <code class="option">--optimise=no</code> means only steps 1, 5 and 6 are used. <code class="option">--single-step=yes</code> causes each x86 instruction to be treated as a single basic block. The translations are terrible but this is sometimes instructive.</p><p>The <code class="option">--stop-after=N</code> flag switches back to the real CPU after <code class="computeroutput">N</code> basic blocks. It also re-JITs the final basic block executed and prints the debugging info resulting, so this gives you a way to get a quick snapshot of how a basic block looks as it passes through the six stages mentioned above. If you want to see full information for every block translated (probably not, but still ...) find, in <code class="computeroutput">VG_(translate)</code>, the lines</p><pre class="programlisting">dis = True;
dis = debugging_translation;</pre><p>and comment out the second line. This will spew out debugging junk faster than you can possibly imagine.</p></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="mc-tech-docs.tags"></a>1.2.4. UCode operand tags: type <code class="computeroutput">Tag</code></h3></div></div></div><p>UCode is, more or less, a simple two-address RISC-like code. 
In keeping with the x86 AT&amp;T assembly syntax, generally speaking the first operand is the source operand, and the second is the destination operand, which is modified when the uinstr is notionally executed.</p><p>UCode instructions have up to three operand fields, each of which has a corresponding <code class="computeroutput">Tag</code> describing it. Possible values for the tag are:</p><div class="itemizedlist"><ul type="disc"><li><p><code class="computeroutput">NoValue</code>: indicates that the field is not in use.</p></li><li><p><code class="computeroutput">Lit16</code>: the field contains a 16-bit literal.</p></li><li><p><code class="computeroutput">Literal</code>: the field denotes a 32-bit literal, whose value is stored in the <code class="computeroutput">lit32</code> field of the uinstr itself. Since there is only one <code class="computeroutput">lit32</code> for the whole uinstr, only one operand field may contain this tag.</p></li><li><p><code class="computeroutput">SpillNo</code>: the field contains a spill slot number, in the range 0 to 23 inclusive, denoting one of the spill slots contained inside <code class="computeroutput">VG_(baseBlock)</code>. Such tags only exist after register allocation.</p></li><li><p><code class="computeroutput">RealReg</code>: the field contains a number in the range 0 to 7 denoting an integer x86 ("real") register on the host. The number is the Intel encoding for integer registers. Such tags only exist after register allocation.</p></li><li><p><code class="computeroutput">ArchReg</code>: the field contains a number in the range 0 to 7 denoting an integer x86 register on the simulated CPU. In reality this means a reference to one of the first 8 words of <code class="computeroutput">VG_(baseBlock)</code>. Such tags can exist at any point in the translation process.</p></li><li><p>Last, but not least, <code class="computeroutput">TempReg</code>. The field contains the number of one of an infinite set of virtual (integer) registers. 
<code class="computeroutput">TempReg</code>s are used everywhere throughout the translation process; you can have as many as you want. The register allocator maps as many as it can into <code class="computeroutput">RealReg</code>s and turns the rest into <code class="computeroutput">SpillNo</code>s, so <code class="computeroutput">TempReg</code>s should not exist after the register allocation phase.</p><p><code class="computeroutput">TempReg</code>s are always 32 bits long, even if the data they hold is logically shorter. In that case the upper unused bits are required, and, I think, generally assumed, to be zero. <code class="computeroutput">TempReg</code>s holding V bits for quantities shorter than 32 bits are expected to have ones in the unused places, since a one denotes "undefined".</p></li></ul></div></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="mc-tech-docs.uinstr"></a>1.2.5. UCode instructions: type <code class="computeroutput">UInstr</code></h3></div></div></div><p>UCode was carefully designed to make it possible to do register allocation on UCode and then translate the result into x86 code without needing any extra registers ... well, that was the original plan, anyway. Things have gotten a little more complicated since then. In what follows, UCode instructions are referred to as uinstrs, to distinguish them from x86 instructions. Uinstrs of course have uopcodes which are (naturally) different from x86 opcodes.</p><p>A uinstr (type <code class="computeroutput">UInstr</code>) contains various fields, not all of which are used by any one uopcode:</p><div class="itemizedlist"><ul type="disc"><li><p>Three 16-bit operand fields, <code class="computeroutput">val1</code>, <code class="computeroutput">val2</code> and <code class="computeroutput">val3</code>.</p></li><li><p>Three tag fields, <code class="computeroutput">tag1</code>, <code class="computeroutput">tag2</code> and <code class="computeroutput">tag3</code>. 
Each of these has a value of type <code class="computeroutput">Tag</code>, and they describe what the <code class="computeroutput">val1</code>, <code class="computeroutput">val2</code> and <code class="computeroutput">val3</code> fields contain.</p></li><li><p>A 32-bit literal field.</p></li><li><p>Two <code class="computeroutput">FlagSet</code>s, specifying which x86 condition codes are read and written by the uinstr.</p></li><li><p>An opcode byte, containing a value of type <code class="computeroutput">Opcode</code>.</p></li><li><p>A size field, indicating the data transfer size (1/2/4/8/10) in cases where this makes sense, or zero otherwise.</p></li><li><p>A condition-code field, which, for jumps, holds a value of type <code class="computeroutput">Condcode</code>, indicating