<html xmlns:cf="http://docbook.sourceforge.net/xmlns/chunkfast/1.0"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>1. The Design and Implementation of Valgrind</title><link rel="stylesheet" href="vg_basic.css" type="text/css"><meta name="generator" content="DocBook XSL Stylesheets V1.69.0"><link rel="start" href="index.html" title="Valgrind Documentation"><link rel="up" href="tech-docs.html" title="Valgrind Technical Documentation"><link rel="prev" href="tech-docs.html" title="Valgrind Technical Documentation"><link rel="next" href="cg-tech-docs.html" title="2. How Cachegrind works"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div><table class="nav" width="100%" cellspacing="3" cellpadding="3" border="0" summary="Navigation header"><tr><td width="22px" align="center" valign="middle"><a accesskey="p" href="tech-docs.html"><img src="images/prev.png" width="18" height="21" border="0" alt="Prev"></a></td><td width="25px" align="center" valign="middle"><a accesskey="u" href="tech-docs.html"><img src="images/up.png" width="21" height="18" border="0" alt="Up"></a></td><td width="31px" align="center" valign="middle"><a accesskey="h" href="index.html"><img src="images/home.png" width="27" height="20" border="0" alt="Up"></a></td><th align="center" valign="middle">Valgrind Technical Documentation</th><td width="22px" align="center" valign="middle"><a accesskey="n" href="cg-tech-docs.html"><img src="images/next.png" width="18" height="21" border="0" alt="Next"></a></td></tr></table></div><div class="chapter" lang="en"><div class="titlepage"><div><div><h2 class="title"><a name="mc-tech-docs"></a>1. The Design and Implementation of Valgrind</h2></div><div><h3 class="subtitle"><i>Detailed technical notes for hackers, maintainers and the overly-curious</i></h3></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="sect1"><a 
href="mc-tech-docs.html#mc-tech-docs.intro">1.1. Introduction</a></span></dt><dd><dl><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.history">1.1.1. History</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.overview">1.1.2. Design overview</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.design">1.1.3. Design decisions</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.correctness">1.1.4. Correctness</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.limits">1.1.5. Current limitations</a></span></dt></dl></dd><dt><span class="sect1"><a href="mc-tech-docs.html#mc-tech-docs.jitter">1.2. The instrumenting JITter</a></span></dt><dd><dl><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.storage">1.2.1. Run-time storage, and the use of host registers</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.startup">1.2.2. Startup, shutdown, and system calls</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.ucode">1.2.3. Introduction to UCode</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.tags">1.2.4. UCode operand tags: type <code class="computeroutput">Tag</code></a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.uinstr">1.2.5. UCode instructions: type <code class="computeroutput">UInstr</code></a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.trans">1.2.6. Translation into UCode</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.optim">1.2.7. UCode optimisation</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.instrum">1.2.8. UCode instrumentation</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.cleanup">1.2.9. 
UCode post-instrumentation cleanup</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.transfrom">1.2.10. Translation from UCode</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.dispatch">1.2.11. Top-level dispatch loop</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.lazy">1.2.12. Lazy updates of the simulated program counter</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.signals">1.2.13. Signals</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.todo">1.2.14. To be written</a></span></dt></dl></dd><dt><span class="sect1"><a href="mc-tech-docs.html#mc-tech-docs.extensions">1.3. Extensions</a></span></dt><dd><dl><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.bugs">1.3.1. Bugs</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.threads">1.3.2. Threads</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.verify">1.3.3. Verification suite</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.porting">1.3.4. Porting to other platforms</a></span></dt></dl></dd><dt><span class="sect1"><a href="mc-tech-docs.html#mc-tech-docs.easystuff">1.4. Easy stuff which ought to be done</a></span></dt><dd><dl><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.mmx">1.4.1. MMX Instructions</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.fixstabs">1.4.2. Fix stabs-info reader</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.x86instr">1.4.3. BT/BTC/BTS/BTR</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.prefetch">1.4.4. Using PREFETCH Instructions</a></span></dt><dt><span class="sect2"><a href="mc-tech-docs.html#mc-tech-docs.pranges">1.4.5. 
User-defined Permission Ranges</a></span></dt></dl></dd></dl></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="mc-tech-docs.intro"></a>1.1. Introduction</h2></div></div></div><p>This document contains a detailed, highly-technical description of the internals of Valgrind. This is not the user manual; if you are an end-user of Valgrind, you do not want to read this. Conversely, if you really are a hacker-type and want to know how it works, I assume that you have read the user manual thoroughly.</p><p>You may need to read this document several times, and carefully. Some important things, I only say once.</p><p>[Note: this document is now very old, and a lot of its contents are out of date, and misleading.]</p><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="mc-tech-docs.history"></a>1.1.1. History</h3></div></div></div><p>Valgrind came into public view in late Feb 2002. However, it has been under contemplation for a very long time, perhaps seriously for about five years. Somewhat over two years ago, I started working on the x86 code generator for the Glasgow Haskell Compiler (http://www.haskell.org/ghc), gaining familiarity with x86 internals on the way. I then did Cacheprof, gaining further x86 experience. Some time around Feb 2000 I started experimenting with a user-space x86 interpreter for x86-Linux. This worked, but it was clear that a JIT-based scheme would be necessary to give reasonable performance for Valgrind. Design work for the JITter started in earnest in Oct 2000, and by early 2001 I had an x86-to-x86 dynamic translator which could run quite large programs. This translator was in a sense pointless, since it did not do any instrumentation or checking.</p><p>Most of the rest of 2001 was taken up designing and implementing the instrumentation scheme. 
The main difficulty, which consumed a lot of effort, was to design a scheme which did not generate large numbers of false uninitialised-value warnings. By late 2001 a satisfactory scheme had been arrived at, and I started to test it on ever-larger programs, with an eventual eye to making it work well enough so that it was helpful to folks debugging the upcoming version 3 of KDE. I've used KDE since before version 1.0, and wanted Valgrind to be an indirect contribution to the KDE 3 development effort. At the start of Feb 02 the kde-core-devel crew started using it, and gave a huge amount of helpful feedback and patches in the space of three weeks. Snapshot 20020306 is the result.</p><p>In the best Unix tradition, or perhaps in the spirit of Fred Brooks' depressing-but-completely-accurate epitaph "build one to throw away; you will anyway", much of Valgrind is a second or third rendition of the initial idea. The instrumentation machinery (<code class="filename">vg_translate.c</code>, <code class="filename">vg_memory.c</code>) and core CPU simulation (<code class="filename">vg_to_ucode.c</code>, <code class="filename">vg_from_ucode.c</code>) have had three redesigns and rewrites; the register allocator, low-level memory manager (<code class="filename">vg_malloc2.c</code>) and symbol table reader (<code class="filename">vg_symtab2.c</code>) are on the second rewrite. In a sense, this document serves to record some of the knowledge gained as a result.</p></div><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="mc-tech-docs.overview"></a>1.1.2. Design overview</h3></div></div></div><p>Valgrind is compiled into a Linux shared object, <code class="filename">valgrind.so</code>, and also a dummy one, <code class="filename">valgrinq.so</code>, of which more later. 
The <code class="filename">valgrind</code> shell script adds <code class="filename">valgrind.so</code> to the <code class="computeroutput">LD_PRELOAD</code> list of extra libraries to be loaded with any dynamically linked executable. This is a standard trick, one which I assume the <code class="computeroutput">LD_PRELOAD</code> mechanism was developed to support.</p><p><code class="filename">valgrind.so</code> is linked with the <code class="computeroutput">-z initfirst</code> flag, which requests that its initialisation code is run before that of any other object in the executable image. When this happens, Valgrind gains control. The real CPU becomes "trapped" in <code class="filename">valgrind.so</code> and the translations it generates. The synthetic CPU provided by Valgrind does, however, return from this initialisation function. So the normal startup actions, orchestrated by the dynamic linker <code class="filename">ld.so</code>, continue as usual, except on the synthetic CPU, not the real one. Eventually <code class="function">main</code> is run and returns, and then the finalisation code of the shared objects is run, presumably in the reverse of the order in which they were initialised. Remember, this is still all happening on the simulated CPU. Eventually <code class="filename">valgrind.so</code>'s own finalisation code is called. It spots this event, shuts down the simulated CPU, prints any error summaries and/or does leak detection, and returns from the initialisation code on the real CPU. At this point, in effect the real and synthetic CPUs have merged back into one, Valgrind has lost control of the program, and the program finally <code class="function">exit()s</code> back to the kernel in the usual way.</p><p>The normal course of activity, once Valgrind has started up, is as follows. Valgrind never runs any part of your program (usually referred to as the "client"), not a single byte of it, directly. 
Instead it uses function <code class="function">VG_(translate)</code> to translate basic blocks (BBs, straight-line sequences of code) into instrumented translations, and those are run instead. The translations are stored in the translation cache (TC), <code class="computeroutput">vg_tc</code>, with the translation table (TT), <code class="computeroutput">vg_tt</code> supplying the original-to-translation code address mapping. Auxiliary array <code class="computeroutput">VG_(tt_fast)</code> is used as a direct-map cache for fast lookups in TT; it usually achieves a hit rate of around 98% and facilitates an orig-to-trans lookup in 4 x86 insns, which is not bad.</p><p>Function <code class="function">VG_(dispatch)</code> in <code class="filename">vg_dispatch.S</code> is the heart of the JIT dispatcher. Once a translated code address has been found, it is executed simply by an x86 <code class="computeroutput">call</code> to the translation. At the end of the translation, the next original code addr is loaded into <code class="computeroutput">%eax</code>, and the translation then does a <code class="computeroutput">ret</code>, taking it back to the dispatch loop, with, interestingly, zero branch mispredictions. The address requested in <code class="computeroutput">%eax</code> is looked up first in <code class="function">VG_(tt_fast)</code>, and, if not found, by calling C helper <code class="function">VG_(search_transtab)</code>. If there is still no translation available, <code class="function">VG_(dispatch)</code> exits back to the top-level C dispatcher <code class="function">VG_(toploop)</code>, which arranges for <code class="function">VG_(translate)</code> to make a new translation. All fairly unsurprising, really. There are various complexities described below.</p><p>The translator, orchestrated by <code class="function">VG_(translate)</code>, is complicated but entirely self-contained. It is described in great detail in subsequent sections. 
Translations are stored in TC, with TT tracking administrative information. The translations are subject to an approximate LRU-based management scheme. With the current settings, the TC can hold at most about 15MB of translations, and LRU passes prune it to about 13.5MB. Given that the orig-to-translation expansion ratio is about 13:1 to 14:1, this means TC holds translations for more or less a megabyte of original code, which generally comes to about 70000 basic blocks for C++ compiled with optimisation on. Generating new translations is expensive, so it is worth having a large TC to minimise the (capacity) miss rate.</p><p>The dispatcher, <code class="function">VG_(dispatch)</code>, receives hints from the translations which allow it to cheaply spot all control transfers corresponding to x86 <code class="computeroutput">call</code> and <code class="computeroutput">ret</code> instructions. It has to do this in order to spot some special events:</p><div class="itemizedlist"><ul type="disc"><li><p>Calls to <code class="function">VG_(shutdown)</code>. This is Valgrind's cue to exit. NOTE: actually this is done a different way; it should be cleaned up.</p></li><li><p>Returns of signal handlers, to the return address <code class="function">VG_(signalreturn_bogusRA)</code>. The signal simulator needs to know when a signal handler is returning, so we spot jumps (returns) to this address.</p></li><li><p>Calls to <code class="function">vg_trap_here</code>.