📄 mpid_onesided.h
 * Find element in used list that "matches" according to
 * 'func'('el', ...). 'func' is called with arbitrary parameter 'el'
 * and pointer to element under test. Only one element is found,
 * always the first "match". 'func' returns 0 for match (success).
 *
 * Returns NULL if no match found.
 * If 'parent' is not NULL, returns pointer to parent element there.
 * Note, '*parent' == NULL means element is first in list.
 *
 * \param[in] qhead       Queue Head
 * \param[in] func        Function to use to test for desired element
 * \param[in] v3          void arg passed to \e func in 3rd arg
 * \param[in] el          Static first parameter for 'func'
 * \param[in,out] parent  Pointer to parent element to start search from;
 *                        Pointer to parent element of match found,
 *                        or NULL if 'el' is at top of queue.
 * \return Pointer to element found with 'parent' set,
 *      or NULL if not found.
 *
 * \ref rsrc_design
 */
void *MPIDU_find_element(struct mpid_qhead *qhead,
                int (*func)(void *, void *, void *), void *v3, void *el,
                struct mpid_element **parent);

/*
 * * * * * Win Locks and Lock wait queue * * * * *
 */

/**
 * \brief Progress (advance) wait for window lock to be released
 *
 * Adds a dummy waiter to the lock wait queue, to ensure that
 * unlock will eventually give us a chance.
 *
 * Called from various epoch-start code to ensure no other node is
 * accessing our window while we are in another epoch.
 *
 * \todo Probably should assert that the popped waiter,
 * if any, was our NULL one.
 *
 * \param[in] win       Pointer to MPID_Win object
 * \return nothing
 */
void MPIDU_Spin_lock_free(MPID_Win *win);

/**
 * \brief Test whether window lock is free
 *
 * \param[in] win       Pointer to MPID_Win object
 * \return Boolean TRUE if lock is free
 */
int MPIDU_is_lock_free(MPID_Win *win);

/*
 * * * * * Unlock wait queue * * * * *
 */

/*
 * * * * * Remote (origin, foreign) Datatype cache * * * * *
 */

/**
 * \page dtcache_design Datatype Cache Design
 *
 * The datatype cache element stores the rank, datatype handle
 * and the localized datatype object (map and iovec).
 * Builtin datatypes are not cached (and not sent).
 *
 * This cache is used in a split fashion, where "cloned"
 * cache entries exist on the origin side to tell the origin
 * when it can skip (re-)sending the datatype. On the target
 * side the datatype will be fully allocated for each origin.
 * Because a node may be both an origin at one time and
 * a target at another, cache entries must be separated since
 * the handles in the two cases might match but do not indicate
 * the same datatype. Entries that are origin-side datatypes have
 * the (target) rank with the high bit set. This prevents a
 * collision between local datatypes we send to that target
 * and foreign datatypes sent to us from that target.
 *
 * Datatype transfers are done in two sends.
 *
 * - The first send
 * consists of the \e MPID_Type_map structure, as generated on
 * the origin node.
 * - The second send is the datatype's \e DLOOP_VECTOR, which
 * defines the contiguous, type-less, regions.
 *
 * The actual (original) map and iovec are created/stored in a cache entry
 * under the origin node. Since the origin node never talks to itself,
 * this cache entry will never conflict with any remote datatype caching.
 *
 * Before any sends are done on the origin node, an attempt is made
 * to create a new cache entry for this datatype/target rank pair.
 * If this succeeds, then the datatype has not been sent to the
 * target before and so will be sent now. Otherwise the entire
 * transfer of the datatype will be skipped.
 *
 * When the target node receives the first send, the callback
 * attempts to create a datatype cache entry for the datatype/origin
 * pair. Then a handle-object is created and a receive is set up
 * into the handle-object map buffer.
 *
 * When the target node receives the second send, the callback
 * allocates a buffer for the iovec. It then sets up to
 * receive into the dataloop buffer.
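 *
 * The split use of the cache described above can be illustrated with a
 * minimal, self-contained sketch. It is not the code behind this header:
 * the names \e dtc_entry, \e dtc_try_create, \e origin_must_send,
 * \e target_record and \e DTC_ORIGIN_FLAG are hypothetical, a 32-bit rank
 * is assumed, and a fixed-size table stands in for the real resource queue.
 * It only shows how setting the high bit of the rank keeps origin-side and
 * target-side entries apart, and how "entry created" maps to "must send".
 *
```c
#include <assert.h>
#include <stdint.h>

// Hypothetical: high bit of the rank marks an origin-side ("cloned") entry.
#define DTC_ORIGIN_FLAG 0x80000000u
// Hypothetical fixed capacity, only for this sketch.
#define DTC_MAX 16

struct dtc_entry {
    uint32_t rank_key;  // rank, with DTC_ORIGIN_FLAG set for origin-side entries
    uint32_t handle;    // datatype handle as seen by the origin
};

static struct dtc_entry dtc_tab[DTC_MAX];
static int dtc_count = 0;

// Try to create an entry for (rank_key, handle).
// Returns 1 if created, 0 if it already existed, -1 if the table is full.
static int dtc_try_create(uint32_t rank_key, uint32_t handle)
{
    assert(dtc_count >= 0 && dtc_count <= DTC_MAX);
    for (int i = 0; i < dtc_count; i++) {
        if (dtc_tab[i].rank_key == rank_key && dtc_tab[i].handle == handle)
            return 0;
    }
    if (dtc_count == DTC_MAX)
        return -1;
    dtc_tab[dtc_count].rank_key = rank_key;
    dtc_tab[dtc_count].handle = handle;
    dtc_count++;
    return 1;
}

// Origin side: returns 1 when the datatype still has to be sent to the
// target, 0 when the transfer can be skipped. A full table also yields 1,
// which is safe here because the target accepts a resend.
static int origin_must_send(uint32_t target_rank, uint32_t handle)
{
    return dtc_try_create(target_rank | DTC_ORIGIN_FLAG, handle) != 0;
}

// Target side: record a foreign datatype received from origin_rank.
// The key carries no flag, so it cannot collide with an origin-side entry
// for the same rank and handle.
static void target_record(uint32_t origin_rank, uint32_t handle)
{
    (void)dtc_try_create(origin_rank, handle);
}
```
 *
 * In this sketch the first call to \e origin_must_send for a given
 * target/handle pair reports that a send is needed and later calls report
 * that it can be skipped, while \e target_record for the same rank and
 * handle creates a separate entry.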
 *
 * In order to facilitate/optimize cache flushing, a remote (target)
 * node always receives a datatype that is sent, even if it already
 * has a cache entry (i.e. it overwrites any existing cache data).
 * This means that the origin node need only flush its own, local, cache
 * when a datatype goes away, and if/when a new datatype uses the
 * same handle then the target side will get a new copy and replace
 * the old one.
 */

/**
 * \brief Remove a datatype cache entry
 *
 * \param[in] dtp       Pointer to MPID_Datatype object to un-cache
 * \return nothing
 */
void MPIDU_dtc_free(MPID_Datatype *dtp);

#ifdef NOT_USED
/**
 * \brief Get Datatype info for a foreign datatype
 *
 * Lookup a foreign (remote, origin) datatype in local cache.
 * Uses origin lpid and (foreign) datatype.
 *
 * \param[in] lpid      Rank of origin
 * \param[in] fdt       Foreign (origin) datatype handle to search for
 * \param[out] dti      Pointer to datatype info struct
 * \return 0 if locally cached datatype found,
 *      or 1 if not found.
 *
 * \ref dtcache_design
 */
int MPIDU_lookup_dt(int lpid, MPI_Datatype fdt, mpid_dt_info *dti);
#endif /* NOT_USED */

/**
 * \brief Prepare to receive a foreign datatype (step 1 - map).
 *
 * Called when MPID_MSGTYPE_DT_MAP (first datatype packet) received.
 * Returns NULL if this datatype is already in the cache.
 * Since the origin should be mirroring our cache status,
 * we would expect to never see this case here.
 * Must be the first of sequence:
 * - MPID_MSGTYPE_DT_MAP
 * - MPID_MSGTYPE_DT_IOV
 * - MPID_MSGTYPE_ACC (_PUT, _GET)
 *
 * However, the cache operation is not dependent on any subsequent
 * RMA operations - i.e. the caching may be done for its own sake.
 *
 * Allocates storage for the map and updates cache element.
 *
 * mpid_info_w0 = MPID_MSGTYPE_DT_MAP
 * mpid_info_w1 = map size, bytes
 * mpid_info_w2 = origin lpid
 * mpid_info_w3 = foreign datatype handle
 * mpid_info_w4 = datatype extent
 * mpid_info_w5 = datatype element type
 * mpid_info_w6 = datatype element size
 * mpid_info_w7 = (not used)
 *
 * \param[in] mi        MPIDU_Onesided_info_t containing data
 * \return pointer to buffer to receive foreign datatype map
 *      structure, or NULL if datatype is already cached.
 *
 * \ref dtcache_design
 */
char *MPID_Prepare_rem_dt(MPIDU_Onesided_info_t *mi);

#ifdef NOT_USED
/**
 * \brief Prepare to update foreign datatype (step 2 - iov).
 *
 * Called when MPID_MSGTYPE_DT_IOV (second datatype packet) received.
 * Returns NULL if this datatype is already in the cache.
 * Must be the second of sequence:
 * - MPID_MSGTYPE_DT_MAP
 * - MPID_MSGTYPE_DT_IOV
 * - MPID_MSGTYPE_ACC (_PUT, _GET)
 *
 * Allocates storage for the iov and updates cache element.
 *
 * \param[in] lpid      Rank of origin
 * \param[in] fdt       Foreign (origin) datatype handle to search for
 * \param[in] dlz       iov size (number of elements)
 * \return pointer to buffer to receive foreign datatype iov
 *      structure, or NULL if datatype is already cached.
 *
 * \ref dtcache_design
 */
char *mpid_update_rem_dt(int lpid, MPI_Datatype fdt, int dlz);
#endif /* NOT_USED */

/**
 * \brief completion for datatype cache messages (map and iov)
 *
 * To use this callback, the msginfo (DCQuad) must
 * be filled as follows:
 *
 * - \e w0 - extent size
 * - \e w1 - number of elements in map or iov
 * - \e w2 - origin rank
 * - \e w3 - datatype handle on origin
 *
 * \param[in] xt        Pointer to xtra msginfo saved from original message
 * \return nothing
 *
 */
void MPID_Recvdone1_rem_dt(const DCQuad *xt);

#ifdef NOT_USED
/**
 * \brief completion for datatype cache messages (map and iov)
 *
 * To use this callback, the msginfo (DCQuad) must
 * be filled as follows:
 *
 * - \e w0 - MPID_MSGTYPE_DT_IOV
 * - \e w1 - number of elements in map or iov
 * - \e w2 - origin rank
 * - \e w3 - datatype handle on origin
 *
 * \param[in] xt        Pointer to xtra msginfo saved from original message
 * \return nothing
 *
 */
void mpid_recvdone2_rem_dt(const DCQuad *xt);
#endif /* NOT_USED */

/**
 * \brief Checks whether a local datatype has already been cached
 * at the target node.
 *
 * Determine whether a local datatype has already been sent to
 * this target (and thus is cached over there).
 * Returns bool TRUE if datatype is (should be) in lpid's cache.
 *
 * Should only be called on the origin.
 *
 * \param[in] lpid      lpid of target
 * \param[in] dt        Local datatype handle to search for
 * \param[out] dti      Pointer to datatype info struct
 * \return Boolean TRUE if the datatype has already been cached.
 *
 * \ref dtcache_design
 */
int MPIDU_check_dt(int lpid, MPI_Datatype dt, mpid_dt_info *dti);

/*
 * * * * * Request object (DCMF_Request_t) cache * * * * *
 *
 * Because the request object is larger than a cache line,
 * no attempt is made to keep objects cache-aligned, for example
 * by padding the header to be the same size as the element or
 * padding the element to a cache-line size.
 *
 * The "piggy-back" data is declared as DCQuad for no special
 * reason - it was simply a convenient type that contained
 * adequate space. This component is not used directly as
 * msginfo in any message layer calls.
 *
 */

/**
 * \page rqcache_design Request Object Cache Design
 *
 * The request cache element consists of a \e DCMF_Request_t
 * and a single \e DCQuad that may be used to save context
 * between the routine that allocated the request object and the
 * callback that frees it.
 *
 * When a request is allocated, the only value returned is
 * a pointer to the \e DCMF_Request_t field of the cache element.
 * When a request is freed, the cache must be searched for
 * a matching element, which is then moved to the free list.
 * Before the element is moved to the free list, the \e DCQuad
 * must be copied into a caller-supplied buffer or it will be lost.
 *
 * Callbacks that involve a request cache element will call
 * \e MPIDU_free_req with a \e DCQuad buffer to receive the context
 * info, if used. Then the context info is examined and action
 * taken accordingly. Common use for the context info is to
 * free a buffer involved in a send operation and/or decrement
 * a counter to indicate completion.
 */

/**
 * \brief Get a new request object from the resource queue.
 *
 * If 'bgq' is not NULL, copy data into request cache element,
 * otherwise zero the field.
 * Returns pointer to the request component of the cache element.
 *
 * \param[in] bgq       Optional pointer to additional info to save
 * \param[out] info     Optional pointer to private info to use
 * \return Pointer to DCMF request object
 *
 * \ref rqcache_design
 */
DCMF_Request_t *MPIDU_get_req(const DCQuad *bgq,
                        MPIDU_Onesided_info_t **info);

/**
 * \brief Release a DCMF request object and retrieve info
 *
 * Locate the request object in the request cache and free it.
 * If 'bgq' is not NULL, copy piggy-back data into 'bgq'.
 * Assumes request object was returned by a call to MPIDU_get_req().
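 *
 * The get/free pairing can be pictured with the following minimal sketch.
 * It is an illustration under stated assumptions, not the code behind
 * these routines: \e req_t and \e quad_t are hypothetical stand-ins for
 * \e DCMF_Request_t and \e DCQuad, \e rq_get and \e rq_free are
 * hypothetical names, and a fixed array with an in-use flag replaces the
 * real used/free lists.
 *
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical stand-ins for DCQuad and DCMF_Request_t.
typedef struct { unsigned w0, w1, w2, w3; } quad_t;
typedef struct { char opaque[64]; } req_t;

// One cache element: the request object plus its piggy-back context.
struct rq_elem {
    req_t  req;
    quad_t xtra;
    int    in_use;
};

// Hypothetical fixed pool size, only for this sketch.
#define RQ_MAX 8
static struct rq_elem rq_pool[RQ_MAX];

// Allocate an element and return a pointer to its request field only.
// If xtra is not NULL its contents are saved, otherwise the field is zeroed.
static req_t *rq_get(const quad_t *xtra)
{
    for (int i = 0; i < RQ_MAX; i++) {
        if (!rq_pool[i].in_use) {
            rq_pool[i].in_use = 1;
            if (xtra)
                rq_pool[i].xtra = *xtra;
            else
                memset(&rq_pool[i].xtra, 0, sizeof rq_pool[i].xtra);
            return &rq_pool[i].req;
        }
    }
    return NULL;
}

// Search for the element owning req, copy its context out if the caller
// supplied a buffer, then release it. Returns 0 on success, -1 if req is
// not an in-use element of the pool.
static int rq_free(req_t *req, quad_t *out)
{
    for (int i = 0; i < RQ_MAX; i++) {
        if (rq_pool[i].in_use && &rq_pool[i].req == req) {
            if (out)
                *out = rq_pool[i].xtra;  // copy before the element is reused
            rq_pool[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}
```
 *
 * A completion callback in this style would call the free routine with a
 * local context buffer and then act on the saved words, for example by
 * decrementing a pending counter.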
 *
 * \param[in] req       Pointer to DCMF request object being released
 * \param[out] bgq      Optional pointer to receive saved additional info
 * \return nothing
 *
 * \ref rqcache_design
 */
void MPIDU_free_req(DCMF_Request_t *req, DCQuad *bgq);

/*
 * * * * * * Callbacks used on request cache objects * * * * *
 */

/**
 * \brief Generic request cache done callback with counter decr
 *
 * Callback for decrementing a "done" or pending count.
 *
 * To use this callback, the "xtra" info (DCQuad) must
 * be filled as follows:
 *
 * - \e w0 - (int *) pending counter
 * - \e w1 - ignored
 * - \e w2 - ignored
 * - \e w3 - ignored
 *
 * \param[in] v Pointer to DCMF request object
 * \return nothing
 *
 * \ref rqcache_design
 */
void done_rqc_cb(void *v);

#ifdef NOT_USED
/**
 * \brief Generic request cache done callback with counter decr
 * and 2-buffer freeing.
 *
 * Callback for decrementing a "done" or pending count and
 * freeing malloc() memory, up to two pointers.
 *
 * To use this callback, the "xtra" info (DCQuad) must
 * be filled as follows:
 *
 * - \e w0 - (int *) pending counter
 * - \e w1 - ignored
 * - \e w2 - (void *) allocated memory if not NULL
 * - \e w3 - (void *) allocated memory if not NULL
 *
 * \param[in] v Pointer to DCMF request object
 * \return nothing
 *
 * \ref rqcache_design
 */
void done_free_rqc_cb(void *v);
#endif /* NOT_USED */

/**
 * \brief request cache done callback for Get, with counter decr,
 * ref count, buffer freeing and dt release when ref count reaches zero.
 * Also uses dt to unpack results into application buffer.
 *
 * Callback for decrementing a "done" or pending count and
 * freeing malloc() memory, up to two pointers, when ref count goes 0.
 *
 * To use this callback, the "xtra" info (DCQuad) must
 * be filled as follows:
 *
 * - \e w0 - (int *) pending counter
 * - \e w1 - (int *) get struct
 * - \e w2 - (void *) allocated memory if not NULL
 * - \e w3 - (void *) allocated memory if not NULL
 *
 * \param[in] v Pointer to DCMF request object
 * \return nothing
 *
 * \ref rqcache_design
 */
void done_getfree_rqc_cb(void *v);

/**
 * \brief Generic request cache done callback with counter decr,
 * ref count, and 2-buffer freeing when ref count reaches zero.
 *
 * Callback for decrementing a "done" or pending count and
 * freeing malloc() memory, up to two pointers, when ref count goes 0.
 *
 * To use this callback, the "xtra" info (DCQuad) must
 * be filled as follows:
 *
 * - \e w0 - (int *) pending counter
 * - \e w1 - (int *) reference counter
 * - \e w2 - (void *) allocated memory if not NULL
 * - \e w3 - (void *) allocated memory if not NULL
 *
 * \param[in] v Pointer to DCMF request object
 * \return nothing
 *
 * \ref rqcache_design
 */
void done_reffree_rqc_cb(void *v);

#ifdef NOT_USED
/**
 * \brief Callback for freeing malloc() memory, up to two pointers.
 *
 * To use this callback, the "xtra" info (DCQuad) must
 * be filled as follows:
 *
 * - \e w0 - (void *) allocated memory if not NULL
 * - \e w1 - (void *) allocated memory if not NULL
 * - \e w2 - ignored
 * - \e w3 - ignored
 *
 * \param[in] v Pointer to DCMF request object
 * \return nothing
 *
 * \ref rqcache_design
 */
void free_rqc_cb(void *v);
#endif /* NOT_USED */

/**
 * \brief Callback invoked to count an RMA operation received
 *
 * Increments window's \e my_rma_recvs counter.
 * If window lock is held, then also increment RMA counter
 * for specific origin node, and check whether this RMA op
 * completes the epoch and an unlock is waiting to be processed.
 *
 * We use \e rma_sends to count received RMA ops because we
 * know we won't be using that to count sent RMA ops since
 * we cannot be in an access epoch while in a LOCK exposure epoch.
 *
 * Called from both the "long message" completion callbacks and
 * the "short message" receive callback, in case of PUT or
 * ACCUMULATE only.