dtrace_impl.h
 * dtrace_probe() loop would have to be careful to not call any further DIF
 * emulation while the variable is locked to avoid deadlock.  More generally,
 * if one were to implement (1), DIF emulation code dealing with dynamic
 * variables could only deal with one dynamic variable at a time (lest
 * deadlock result).  To sum, (1) exports too much subtlety to the users of
 * dynamic variables -- increasing maintenance burden and imposing serious
 * constraints on future DTrace development.
 *
 * The implementation of (2) is also complex, but the complexity is more
 * manageable.  We need to be sure that when a variable is deallocated, it is
 * not placed on a traditional free list, but rather on a _dirty_ list.  Once
 * a variable is on a dirty list, it cannot be found by CPUs performing a
 * subsequent lookup of the variable -- but it may still be in use by other
 * CPUs.  To assure that all CPUs that may be seeing the old variable have
 * cleared out of probe context, a dtrace_sync() can be issued.  Once the
 * dtrace_sync() has completed, it can be known that all CPUs are done
 * manipulating the dynamic variable -- the dirty list can be atomically
 * appended to the free list.  Unfortunately, there's a slight hiccup in this
 * mechanism:  dtrace_sync() may not be issued from probe context.  The
 * dtrace_sync() must therefore be issued asynchronously from non-probe
 * context.  For this we rely on the DTrace cleaner, a cyclic that runs at
 * the "cleanrate" frequency.  To ease this implementation, we define several
 * chunk lists:
 *
 *   - Dirty.  Deallocated chunks, not yet cleaned.  Not available.
 *
 *   - Rinsing.  Formerly dirty chunks that are currently being asynchronously
 *     cleaned.  Not available, but will be shortly.  Dynamic variable
 *     allocation may not spin or block for availability, however.
 *
 *   - Clean.  Clean chunks, ready for allocation -- but not on the free list.
 *
 *   - Free.  Available for allocation.
 *
 * Moreover, to avoid absurd contention, _each_ of these lists is implemented
 * on a per-CPU basis.  This is only for performance, not correctness; chunks
 * may be allocated from another CPU's free list.  The algorithm for
 * allocation then is this:
 *
 *   (1)  Attempt to atomically allocate from current CPU's free list.  If
 *        list is non-empty and allocation is successful, allocation is
 *        complete.
 *
 *   (2)  If the clean list is non-empty, atomically move it to the free
 *        list, and reattempt (1).
 *
 *   (3)  If the dynamic variable space is in the CLEAN state, look for free
 *        and clean lists on other CPUs by setting the current CPU to the
 *        next CPU, and reattempting (1).  If the next CPU is the current CPU
 *        (that is, if all CPUs have been checked), atomically switch the
 *        state of the dynamic variable space based on the following:
 *
 *        - If no free chunks were found and no dirty chunks were found,
 *          atomically set the state to EMPTY.
 *
 *        - If dirty chunks were found, atomically set the state to DIRTY.
 *
 *        - If rinsing chunks were found, atomically set the state to RINSING.
 *
 *   (4)  Based on the state of the dynamic variable space, increment the
 *        appropriate counter to indicate dynamic drops (if in EMPTY state)
 *        vs. dynamic dirty drops (if in DIRTY state) vs. dynamic rinsing
 *        drops (if in RINSING state).  Fail the allocation.
 *
 * The cleaning cyclic operates with the following algorithm:  for all CPUs
 * with a non-empty dirty list, atomically move the dirty list to the rinsing
 * list.  Perform a dtrace_sync().  For all CPUs with a non-empty rinsing
 * list, atomically move the rinsing list to the clean list.  Perform another
 * dtrace_sync().  By this point, all CPUs have seen the new clean list; the
 * state of the dynamic variable space can be restored to CLEAN.
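 *
 * What follows is a minimal userland sketch -- not DTrace's actual
 * implementation -- of steps (1) and (2) of the allocation algorithm above,
 * using C11 atomics.  The chunk and per-CPU list types are simplified
 * stand-ins for the dtrace_dynvar_t and dtrace_dstate_percpu_t structures
 * defined below, and a production version would also have to contend with
 * the ABA problem on the free-list pop.
 */

#include <stdatomic.h>
#include <stddef.h>

typedef struct chunk {
        struct chunk *chk_next;                 /* next chunk on list */
} chunk_t;

typedef struct percpu_lists {
        _Atomic(chunk_t *) pcl_free;            /* free list for this CPU */
        _Atomic(chunk_t *) pcl_clean;           /* clean list for this CPU */
} percpu_lists_t;

static chunk_t *
chunk_alloc(percpu_lists_t *pcpu)
{
        chunk_t *head, *clean, *rest;

        /* (1) Attempt to atomically pop the head of this CPU's free list. */
        head = atomic_load(&pcpu->pcl_free);

        while (head != NULL) {
                if (atomic_compare_exchange_weak(&pcpu->pcl_free, &head,
                    head->chk_next))
                        return (head);          /* allocation complete */
                /* The failed CAS reloaded 'head'; retry the pop. */
        }

        /* (2) The free list is empty: atomically claim the clean list. */
        clean = atomic_exchange(&pcpu->pcl_clean, NULL);

        if (clean == NULL)
                return (NULL);          /* caller moves on to step (3) */

        /*
         * Keep the head chunk and push the remaining clean chunks back onto
         * the free list one at a time -- a simplification of "atomically
         * move it to the free list, and reattempt (1)".
         */
        head = clean;
        rest = clean->chk_next;

        while (rest != NULL) {
                chunk_t *next = rest->chk_next;
                chunk_t *old = atomic_load(&pcpu->pcl_free);

                do {
                        rest->chk_next = old;
                } while (!atomic_compare_exchange_weak(&pcpu->pcl_free,
                    &old, rest));

                rest = next;
        }

        return (head);
}

/*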
 * There exist two final races that merit explanation.  The first is a simple
 * allocation race:
 *
 *              CPU A                                CPU B
 *  +---------------------------------+  +---------------------------------+
 *  |                                 |  |                                 |
 *  | allocates dynamic object a[123] |  | allocates dynamic object a[123] |
 *  | by storing the value 345 to it  |  | by storing the value 567 to it  |
 *  |                                 |  |                                 |
 *  :                                 :  :                                 :
 *  .                                 .  .                                 .
 *
 * Again, this is a race in the D program.  It can be resolved by having
 * a[123] hold the value 345 or a[123] hold the value 567 -- but it must be
 * true that a[123] have only _one_ of these values.  (That is, the racing
 * CPUs may not put the same element twice on the same hash chain.)  This is
 * resolved simply:  before the allocation is undertaken, the start of the
 * new chunk's hash chain is noted.  Later, after the allocation is complete,
 * the hash chain is atomically switched to point to the new element.  If
 * this fails (because of either concurrent allocations or an allocation
 * concurrent with a deletion), the newly allocated chunk is deallocated to
 * the dirty list, and the whole process of looking up (and potentially
 * allocating) the dynamic variable is reattempted.
 *
 * The final race is a simple deallocation race:
 *
 *              CPU A                                CPU B
 *  +---------------------------------+  +---------------------------------+
 *  |                                 |  |                                 |
 *  | deallocates dynamic object      |  | deallocates dynamic object      |
 *  | a[123] by storing the value 0   |  | a[123] by storing the value 0   |
 *  | to it                           |  | to it                           |
 *  |                                 |  |                                 |
 *  :                                 :  :                                 :
 *  .                                 .  .                                 .
 *
 * Once again, this is a race in the D program, but it is one that we must
 * handle without corrupting the underlying data structures.  Because
 * deallocations require the deletion of a chunk from the middle of a hash
 * chain, we cannot use a single-word atomic operation to remove it.  For
 * this, we add a spin lock to the hash buckets that is _only_ used for
 * deallocations (allocation races are handled as above).  Further, this
 * spin lock is _only_ held for the duration of the delete; before control
 * is returned to the DIF emulation code, the hash bucket is unlocked.
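 *
 * Below is a hedged C11 sketch of both resolutions: a compare-and-swap of
 * the bucket's chain head for allocation, and a per-bucket spin lock (the
 * analogue of dtdh_lock, defined below) held only across the unlink for
 * deallocation.  The type and function names here are illustrative
 * stand-ins, not DTrace's actual code.
 */

#include <stdatomic.h>
#include <stddef.h>

typedef struct dynvar {
        struct dynvar *dv_next;                 /* next on hash chain */
        unsigned long long dv_value;            /* stand-in for tuple/data */
} dynvar_t;

typedef struct bucket {
        _Atomic(dynvar_t *) bkt_chain;          /* hash chain head */
        atomic_flag bkt_lock;                   /* deallocation lock only */
} bucket_t;

/*
 * Allocation: the caller noted the chain head before allocating; we link
 * the new element ahead of it and atomically switch the head.  A zero
 * return means a concurrent allocation or deletion intervened; the caller
 * then deallocates 'newdv' to the dirty list and reattempts the entire
 * lookup, as described above.
 */
static int
dynvar_tryinsert(bucket_t *b, dynvar_t *newdv)
{
        dynvar_t *start = atomic_load(&b->bkt_chain);

        newdv->dv_next = start;
        return (atomic_compare_exchange_strong(&b->bkt_chain, &start, newdv));
}

/*
 * Deallocation: removal from mid-chain cannot be a single-word atomic
 * operation, so the bucket is locked -- but only for the duration of the
 * unlink, and by spinning rather than blocking.
 */
static void
dynvar_delete(bucket_t *b, dynvar_t *dv)
{
        dynvar_t *cur, *prev;

        while (atomic_flag_test_and_set_explicit(&b->bkt_lock,
            memory_order_acquire))
                continue;                       /* spin; never block */

        cur = atomic_load(&b->bkt_chain);

        if (cur == dv &&
            atomic_compare_exchange_strong(&b->bkt_chain, &cur, dv->dv_next))
                goto out;               /* unlinked from the head */

        /*
         * 'dv' is mid-chain (or a concurrent insertion just displaced it
         * from the head, updating 'cur'); walk to its predecessor.  Only
         * insertions at the head can race with us, and they cannot touch
         * the predecessor's link.
         */
        for (prev = cur; prev->dv_next != dv; prev = prev->dv_next)
                continue;
        prev->dv_next = dv->dv_next;
out:
        /* Unlock before control returns to the DIF emulation code. */
        atomic_flag_clear_explicit(&b->bkt_lock, memory_order_release);
}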

typedef struct dtrace_key {
        uint64_t dttk_value;                    /* data value or data pointer */
        uint64_t dttk_size;                     /* 0 if by-val, >0 if by-ref */
} dtrace_key_t;

typedef struct dtrace_tuple {
        uint32_t dtt_nkeys;                     /* number of keys in tuple */
        uint32_t dtt_pad;                       /* padding */
        dtrace_key_t dtt_key[1];                /* array of tuple keys */
} dtrace_tuple_t;

typedef struct dtrace_dynvar {
        uint64_t dtdv_hashval;                  /* hash value -- 0 if free */
        struct dtrace_dynvar *dtdv_next;        /* next on list or hash chain */
        void *dtdv_data;                        /* pointer to data */
        dtrace_tuple_t dtdv_tuple;              /* tuple key */
} dtrace_dynvar_t;

typedef enum dtrace_dynvar_op {
        DTRACE_DYNVAR_ALLOC,
        DTRACE_DYNVAR_NOALLOC,
        DTRACE_DYNVAR_DEALLOC
} dtrace_dynvar_op_t;

typedef struct dtrace_dynhash {
        dtrace_dynvar_t *dtdh_chain;            /* hash chain for this bucket */
        uintptr_t dtdh_lock;                    /* deallocation lock */
#ifdef _LP64
        uintptr_t dtdh_pad[6];                  /* pad to avoid false sharing */
#else
        uintptr_t dtdh_pad[14];                 /* pad to avoid false sharing */
#endif
} dtrace_dynhash_t;

typedef struct dtrace_dstate_percpu {
        dtrace_dynvar_t *dtdsc_free;            /* free list for this CPU */
        dtrace_dynvar_t *dtdsc_dirty;           /* dirty list for this CPU */
        dtrace_dynvar_t *dtdsc_rinsing;         /* rinsing list for this CPU */
        dtrace_dynvar_t *dtdsc_clean;           /* clean list for this CPU */
        uint64_t dtdsc_drops;                   /* number of capacity drops */
        uint64_t dtdsc_dirty_drops;             /* number of dirty drops */
        uint64_t dtdsc_rinsing_drops;           /* number of rinsing drops */
#ifdef _LP64
        uint64_t dtdsc_pad;                     /* pad to avoid false sharing */
#else
        uint64_t dtdsc_pad[2];                  /* pad to avoid false sharing */
#endif
} dtrace_dstate_percpu_t;

typedef enum dtrace_dstate_state {
        DTRACE_DSTATE_CLEAN = 0,
        DTRACE_DSTATE_EMPTY,
        DTRACE_DSTATE_DIRTY,
        DTRACE_DSTATE_RINSING
} dtrace_dstate_state_t;

typedef struct dtrace_dstate {
        void *dtds_base;                        /* base of dynamic var. space */
        size_t dtds_size;                       /* size of dynamic var. space */
        size_t dtds_hashsize;                   /* number of buckets in hash */
        size_t dtds_chunksize;                  /* size of each chunk */
        dtrace_dynhash_t *dtds_hash;            /* pointer to hash table */
        dtrace_dstate_state_t dtds_state;       /* current dynamic var. state */
        dtrace_dstate_percpu_t *dtds_percpu;    /* per-CPU dyn. var. state */
} dtrace_dstate_t;
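
/*
 * Hedged sketch of a lookup against the structures above.  DTrace's actual
 * lookup-and-allocate logic is dtrace_dynvar() in dtrace.c, and the hash
 * function below is an arbitrary stand-in.  The chain walk compares the
 * cached dtdv_hashval first, so most non-matching chunks are rejected
 * without a full tuple comparison; the nonzero fixup honors the convention
 * that dtdv_hashval == 0 identifies a free chunk.
 */
static dtrace_dynvar_t *
dtrace_dynvar_lookup_sketch(dtrace_dstate_t *dstate, uint32_t nkeys,
    const dtrace_key_t *keys)
{
        uint64_t hashval = 0;
        uint32_t i;
        dtrace_dynvar_t *dvar;

        for (i = 0; i < nkeys; i++)
                hashval = (hashval << 5) + hashval + keys[i].dttk_value;

        if (hashval == 0)
                hashval = 1;                    /* 0 means "free" */

        dvar = dstate->dtds_hash[hashval % dstate->dtds_hashsize].dtdh_chain;

        for (; dvar != NULL; dvar = dvar->dtdv_next) {
                if (dvar->dtdv_hashval != hashval)
                        continue;               /* cheap rejection */

                if (dvar->dtdv_tuple.dtt_nkeys != nkeys)
                        continue;

                /* (full per-key comparison of dtt_key[] elided here) */
                return (dvar);
        }

        return (NULL);          /* the DTRACE_DYNVAR_NOALLOC outcome */
}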

/*
 * DTrace Variable State
 *
 * The DTrace variable state tracks user-defined variables in its
 * dtrace_vstate structure.  Each DTrace consumer has exactly one
 * dtrace_vstate structure, but some dtrace_vstate structures may exist
 * without a corresponding DTrace consumer (see "DTrace Helpers", below).
 * As described in <sys/dtrace.h>, user-defined variables can have one of
 * three scopes:
 *
 *	DIFV_SCOPE_GLOBAL	=>	global scope
 *	DIFV_SCOPE_THREAD	=>	thread-local scope (i.e. "self->" variables)
 *	DIFV_SCOPE_LOCAL	=>	clause-local scope (i.e. "this->" variables)
 *
 * The variable state tracks variables by both their scope and their
 * allocation type:
 *
 *   - The dtvs_globals member points to an array of dtrace_globvar
 *     structures.  These structures contain both the variable metadata
 *     (dtrace_difv structures) and the underlying storage for all
 *     statically allocated DIFV_SCOPE_GLOBAL variables.
 *
 *   - The dtvs_tlocals member points to an array of dtrace_difv structures
 *     for DIFV_SCOPE_THREAD variables.  As such, this array tracks _only_
 *     the variable metadata for DIFV_SCOPE_THREAD variables; the underlying
 *     storage is allocated out of the dynamic variable space.
 *
 *   - The dtvs_locals member points to an array of uint64_t's that
 *     represent the underlying storage for DIFV_SCOPE_LOCAL variables.  As
 *     DIFV_SCOPE_LOCAL variables may only be scalars, there is no need to
 *     store any variable metadata other than the number of clause-local
 *     variables.
 *
 *   - The dtvs_dynvars member is the dynamic variable state associated with
 *     the variable state.  The dynamic variable state (described in "DTrace
 *     Dynamic Variables", above) tracks all DIFV_SCOPE_THREAD variables and
 *     all dynamically-allocated DIFV_SCOPE_GLOBAL variables.
 */

typedef struct dtrace_globvar {
        uint64_t dtgv_data;                     /* data or pointer to it */
        int dtgv_refcnt;                        /* reference count */
        dtrace_difv_t dtgv_var;                 /* variable metadata */
} dtrace_globvar_t;

typedef struct dtrace_vstate {
        dtrace_globvar_t **dtvs_globals;        /* statically-allocated glbls */
        int dtvs_nglobals;                      /* number of globals */
        dtrace_difv_t *dtvs_tlocals;            /* thread-local metadata */
        int dtvs_ntlocals;                      /* number of thread-locals */
        uint64_t **dtvs_locals;                 /* clause-local data */
        int dtvs_nlocals;                       /* number of clause-locals */
        dtrace_dstate_t dtvs_dynvars;           /* dynamic variable state */
} dtrace_vstate_t;

/*
 * DTrace Machine State
 *
 * In the process of processing a fired probe, DTrace needs to track and/or
 * cache some per-CPU state associated with that particular firing.  This is
 * state that is always discarded after the probe firing has completed, and
 * much of it is not specific to any DTrace consumer, remaining valid across
 * all ECBs.  This state is tracked in the dtrace_mstate structure.
 */
#define DTRACE_MSTATE_ARGS              0x00000001
#define DTRACE_MSTATE_PROBE             0x00000002
#define DTRACE_MSTATE_EPID              0x00000004
#define DTRACE_MSTATE_TIMESTAMP         0x00000008
#define DTRACE_MSTATE_STACKDEPTH        0x00000010
#define DTRACE_MSTATE_CALLER            0x00000020
#define DTRACE_MSTATE_IPL               0x00000040
#define DTRACE_MSTATE_FLTOFFS           0x00000080
#define DTRACE_MSTATE_WALLTIMESTAMP     0x00000100

typedef struct dtrace_mstate {
        uintptr_t dtms_scratch_base;            /* base of scratch space */
        uintptr_t dtms_scratch_ptr;             /* current scratch pointer */
        size_t dtms_scratch_size;               /* scratch size */
        uint32_t dtms_present;                  /* variables that are present */
        uint64_t dtms_arg[5];                   /* cached arguments */
        dtrace_epid_t dtms_epid;                /* current EPID */
        uint64_t dtms_timestamp;                /* cached timestamp */
        hrtime_t dtms_walltimestamp;            /* cached wall timestamp */
        int dtms_stackdepth;                    /* cached stackdepth */
        struct dtrace_probe *dtms_probe;        /* current probe */
        uintptr_t dtms_caller;                  /* cached caller */
        int dtms_ipl;                           /* cached interrupt pri lev */
        int dtms_fltoffs;                       /* faulting DIFO offset */
} dtrace_mstate_t;

#define DTRACE_COND_OWNER       0x1
#define DTRACE_COND_USERMODE    0x2

#define DTRACE_PROBEKEY_MAXDEPTH        8       /* max glob recursion depth */
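
/*
 * A hedged sketch of how the DTRACE_MSTATE_* bits above are meant to be
 * used: dtms_present records which cached values are valid for the current
 * probe firing, so a value such as `timestamp' is computed at most once per
 * firing.  The wrapper function below is illustrative only (the analogous
 * logic lives in the DIF variable lookup in dtrace.c); dtrace_gethrtime()
 * is the real DTrace time interface assumed here.
 */
extern hrtime_t dtrace_gethrtime(void);

static uint64_t
dtrace_mstate_timestamp_sketch(dtrace_mstate_t *mstate)
{
        if (!(mstate->dtms_present & DTRACE_MSTATE_TIMESTAMP)) {
                /* First use in this firing: read the clock and cache it. */
                mstate->dtms_timestamp = dtrace_gethrtime();
                mstate->dtms_present |= DTRACE_MSTATE_TIMESTAMP;
        }

        return (mstate->dtms_timestamp);
}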

/*
 * DTrace Activity
 *
 * Each DTrace consumer is in one of several states, which (for purposes of
 * avoiding yet-another overloading of the noun "state") we call the current
 * _activity_.  The activity transitions on dtrace_go() (from DTRACIOCGO), on
 * dtrace_stop() (from DTRACIOCSTOP) and on the exit() action.  Activities
 * may only transition in one direction; the activity transition diagram is
 * a directed acyclic graph.  The activity transition diagram is as follows:
 *
 *
 * +----------+                   +--------+                   +--------+
 * | INACTIVE |------------------>| WARMUP |------------------>| ACTIVE |
 * +----------+   dtrace_go(),    +--------+   dtrace_go(),    +--------+
 *                before BEGIN        |       after BEGIN        |  |  |
 *                                    |                          |  |  |
 *                      exit() action |                          |  |  |
 *                     from BEGIN ECB |                          |  |  |
 *                                    |                          |  |  |
 *                                    v                          |  |  |
 *                              +----------+  exit() action      |  |  |
 *                              | DRAINING |<--------------------+  |  |
 *                              +----------+                        |  |
 *                                    |                             |  |
 *                     dtrace_stop(), |                             |  |
 *                       before END   |                             |  |
 *                                    |                             |  |
 *                                    v                             |  |
 *  +---------+                 +----------+                        |  |
 *  | STOPPED |<----------------| COOLDOWN |<-----------------------+  |
 *  +---------+  dtrace_stop(), +----------+    dtrace_stop(),         |
 *                 after END                      before END           |
 *                                                                     |
 *                              +--------+                             |
 *                              | KILLED |<----------------------------+
 *                              +--------+       deadman timeout
 *
 * Note that once a DTrace consumer has stopped tracing, there is no way to