malloc/free scheme augmented with arenas, and <code class="filename">vg_mylibc.c</code> exports reimplementations of various bits and pieces you'd normally get from the C library.</p><p>Why all the hassle? Because imagine the potential chaos of both the simulated and real CPUs executing in <code class="filename">glibc.so</code>. It just seems simpler and cleaner to be completely self-contained, so that only the simulated CPU visits <code class="filename">glibc.so</code>. In practice it's not much hassle anyway. Also, valgrind starts up before glibc has a chance to initialise itself, and who knows what difficulties that could lead to. Finally, glibc has definitions for some types, specifically <code class="computeroutput">sigset_t</code>, which conflict (are different from) the Linux kernel's idea of same. When Valgrind wants to fiddle around with signal stuff, it wants to use the kernel's definitions, not glibc's definitions. So it's simplest just to keep glibc out of the picture entirely.</p><p>To find out which glibc symbols are used by Valgrind, reinstate the link flags <code class="computeroutput">-nostdlib -Wl,-no-undefined</code>. This causes linking to fail, but will tell you what you depend on. I have mostly, but not entirely, got rid of the glibc dependencies; what remains is, IMO, fairly harmless. AFAIK the current dependencies are: <code class="computeroutput">memset</code>, <code class="computeroutput">memcmp</code>, <code class="computeroutput">stat</code>, <code class="computeroutput">system</code>, <code class="computeroutput">sbrk</code>, <code class="computeroutput">setjmp</code> and <code class="computeroutput">longjmp</code>.</p></li><li><p>Similarly, valgrind should not really import any headers other than the Linux kernel headers, since it knows of no API other than the kernel interface to talk to. 
At the moment this is really not in a good state, and <code class="computeroutput">vg_syscall_mem</code> imports, via <code class="filename">vg_unsafe.h</code>, a significant number of C-library headers so as to know the sizes of various structs passed across the kernel boundary. This is of course completely bogus, since there is no guarantee that the C library's definitions of these structs match those of the kernel. I have started to sort this out using <code class="filename">vg_kerneliface.h</code>, into which I had intended to copy all kernel definitions which valgrind could need, but this has not gotten very far. At the moment it mostly contains definitions for <code class="computeroutput">sigset_t</code> and <code class="computeroutput">struct sigaction</code>, since the kernel's definitions of these really do clash with glibc's. I plan to use a <code class="computeroutput">vki_</code> prefix on all these types and constants, to denote the fact that they pertain to <span><strong class="command">V</strong></span>algrind's <span><strong class="command">K</strong></span>ernel <span><strong class="command">I</strong></span>nterface.</p><p>Another advantage of having a <code class="filename">vg_kerneliface.h</code> file is that it makes it simpler to interface to a different kernel. One can, for example, easily imagine writing a new <code class="filename">vg_kerneliface.h</code> for FreeBSD, or x86 NetBSD.</p></li></ul></div></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="mc-tech-docs.limits"></a>1.1.5. Current limitations</h3></div></div></div><p>Support for weird (non-POSIX) signal stuff is patchy. Does anybody care?</p></div></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="mc-tech-docs.jitter"></a>1.2. The instrumenting JITter</h2></div></div></div><p>This really is the heart of the matter.
We begin with various side issues.</p><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="mc-tech-docs.storage"></a>1.2.1. Run-time storage, and the use of host registers</h3></div></div></div><p>Valgrind translates client (original) basic blocks into instrumented basic blocks, which live in the translation cache TC, until either the client finishes or the translations are ejected from TC to make room for newer ones.</p><p>Since it generates x86 code in memory, Valgrind has complete control of the use of registers in the translations. Now pay attention. I shall say this only once, and it is important you understand this. In what follows I will refer to registers in the host (real) CPU using their standard names, <code class="computeroutput">%eax</code>, <code class="computeroutput">%edi</code>, etc. I refer to registers in the simulated CPU by capitalising them: <code class="computeroutput">%EAX</code>, <code class="computeroutput">%EDI</code>, etc. These two sets of registers usually bear no direct relationship to each other; there is no fixed mapping between them.
This naming scheme is used fairly consistently in the comments in the sources.</p><p>Host registers, once things are up and running, are used as follows:</p><div class="itemizedlist"><ul type="disc"><li><p><code class="computeroutput">%esp</code>, the real stack pointer, points somewhere in Valgrind's private stack area, <code class="computeroutput">VG_(stack)</code> or, transiently, into its signal delivery stack, <code class="computeroutput">VG_(sigstack)</code>.</p></li><li><p><code class="computeroutput">%edi</code> is used as a temporary in code generation; it is almost always dead, except when used for the <code class="computeroutput">Left</code> value-tag operations.</p></li><li><p><code class="computeroutput">%eax</code>, <code class="computeroutput">%ebx</code>, <code class="computeroutput">%ecx</code>, <code class="computeroutput">%edx</code> and <code class="computeroutput">%esi</code> are available to Valgrind's register allocator. They are dead (carry unimportant values) in between translations, and are live only in translations. The one exception to this is <code class="computeroutput">%eax</code>, which, as mentioned far above, has a special significance to the dispatch loop <code class="computeroutput">VG_(dispatch)</code>: when a translation returns to the dispatch loop, <code class="computeroutput">%eax</code> is expected to contain the original-code-address of the next translation to run. The register allocator is so good at minimising spill code that using five regs and not having to save/restore <code class="computeroutput">%edi</code> actually gives better code than allocating to <code class="computeroutput">%edi</code> as well, but then having to push/pop it around special uses.</p></li><li><p><code class="computeroutput">%ebp</code> points permanently at <code class="computeroutput">VG_(baseBlock)</code>.
Valgrind's translations are position-independent, partly because this is convenient, but also because translations get moved around in TC as part of the LRUing activity. <span><strong class="command">All</strong></span> static entities which need to be referred to from generated code, whether data or helper functions, are stored starting at <code class="computeroutput">VG_(baseBlock)</code> and are therefore reached by indexing from <code class="computeroutput">%ebp</code>. There is but one exception, which is that by placing the value <code class="computeroutput">VG_EBP_DISPATCH_CHECKED</code> in <code class="computeroutput">%ebp</code> just before a return to the dispatcher, the dispatcher is informed that the next address to run, in <code class="computeroutput">%eax</code>, requires special treatment.</p></li><li><p>The real machine's FPU state is pretty much unimportant, for reasons which will become obvious. Ditto its <code class="computeroutput">%eflags</code> register.</p></li></ul></div><p>The state of the simulated CPU is stored in memory, in <code class="computeroutput">VG_(baseBlock)</code>, which is a block of 200 words IIRC. Recall that <code class="computeroutput">%ebp</code> points permanently at the start of this block. Function <code class="computeroutput">vg_init_baseBlock</code> decides what the offsets of various entities in <code class="computeroutput">VG_(baseBlock)</code> are to be, and allocates word offsets for them. The code generator then emits <code class="computeroutput">%ebp</code> relative addresses to get at those things.
The sequence in which entities are allocated has been carefully chosen so that the 32 most popular entities come first, because this means 8-bit offsets can be used in the generated code.</p><p>If I were clever, I could make <code class="computeroutput">%ebp</code> point 32 words along <code class="computeroutput">VG_(baseBlock)</code>, so that I'd have another 32 words of short-form offsets available, but that's just complicated, and it's not important -- the first 32 words take 99% (or whatever) of the traffic.</p><p>Currently, the sequence of stuff in <code class="computeroutput">VG_(baseBlock)</code> is as follows:</p><div class="itemizedlist"><ul type="disc"><li><p>9 words, holding the simulated integer registers, <code class="computeroutput">%EAX</code> .. <code class="computeroutput">%EDI</code>, and the simulated flags, <code class="computeroutput">%EFLAGS</code>.</p></li><li><p>Another 9 words, holding the V bit "shadows" for the above 9 regs.</p></li><li><p>The <span><strong class="command">addresses</strong></span> of various helper routines called from generated code: <code class="computeroutput">VG_(helper_value_check4_fail)</code>, <code class="computeroutput">VG_(helper_value_check0_fail)</code>, which register V-check failures, <code class="computeroutput">VG_(helperc_STOREV4)</code>, <code class="computeroutput">VG_(helperc_STOREV1)</code>, <code class="computeroutput">VG_(helperc_LOADV4)</code>, <code class="computeroutput">VG_(helperc_LOADV1)</code>, which do stores and loads of V bits to/from the sparse array which keeps track of V bits in memory, and <code class="computeroutput">VGM_(handle_esp_assignment)</code>, which messes with memory addressability resulting from changes in <code class="computeroutput">%ESP</code>.</p></li><li><p>The simulated <code class="computeroutput">%EIP</code>.</p></li><li><p>24 spill words, for when the register allocator can't make it work with 5 measly registers.</p></li><li><p>Addresses of helpers <code
class="computeroutput">VG_(helperc_STOREV2)</code>, <code class="computeroutput">VG_(helperc_LOADV2)</code>. These are here because 2-byte loads and stores are relatively rare, so are placed above the magic 32-word offset boundary.</p></li><li><p>For similar reasons, addresses of helper functions <code class="computeroutput">VGM_(fpu_write_check)</code> and <code class="computeroutput">VGM_(fpu_read_check)</code>, which handle the A/V maps testing and changes required by FPU writes/reads.</p></li><li><p>Some other boring helper addresses: <code class="computeroutput">VG_(helper_value_check2_fail)</code> and <code class="computeroutput">VG_(helper_value_check1_fail)</code>. These are probably never emitted now, and should be removed.</p></li><li><p>The entire state of the simulated FPU, which I believe to be 108 bytes long.</p></li><li><p>Finally, the addresses of various other helper functions in <code class="filename">vg_helpers.S</code>, which deal with rare situations which are tedious or difficult to generate code in-line for.</p></li></ul></div><p>As a general rule, the simulated machine's state lives permanently in memory at <code class="computeroutput">VG_(baseBlock)</code>. However, the JITter does some optimisations which allow the simulated integer registers to be cached in real registers over multiple simulated instructions within the same basic block. These are always