@node GOMP_STACKSIZE
@section @env{GOMP_STACKSIZE} -- Set default thread stack size
@cindex Environment Variable
@cindex Implementation specific setting
@table @asis
@item @emph{Description}:
Set the default thread stack size in kilobytes.  This is in contrast to
@code{pthread_attr_setstacksize}, which takes the number of bytes as an
argument.  If the stack size cannot be set due to system constraints, an
error is reported and the initial stack size is left unchanged.  If undefined,
the stack size is system dependent.

@item @emph{Reference}:
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
GCC Patches mailing list},
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
GCC Patches mailing list}
@end table



@c ---------------------------------------------------------------------
@c The libgomp ABI
@c ---------------------------------------------------------------------

@node The libgomp ABI
@chapter The libgomp ABI

The following sections contain notes on the external ABI presented
by libgomp.
Only maintainers should need them.

@menu
* Implementing MASTER construct::
* Implementing CRITICAL construct::
* Implementing ATOMIC construct::
* Implementing FLUSH construct::
* Implementing BARRIER construct::
* Implementing THREADPRIVATE construct::
* Implementing PRIVATE clause::
* Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
* Implementing REDUCTION clause::
* Implementing PARALLEL construct::
* Implementing FOR construct::
* Implementing ORDERED construct::
* Implementing SECTIONS construct::
* Implementing SINGLE construct::
@end menu


@node Implementing MASTER construct
@section Implementing MASTER construct

@smallexample
if (omp_get_thread_num () == 0)
  block
@end smallexample

Alternately, we generate two copies of the parallel subfunction
and only include this in the version run by the master thread.
Surely that's not worthwhile though...



@node Implementing CRITICAL construct
@section Implementing CRITICAL construct

Without a specified name,

@smallexample
  void GOMP_critical_start (void);
  void GOMP_critical_end (void);
@end smallexample

so that we don't get COPY relocations from libgomp to the main
application.

With a specified name, use @code{omp_set_lock} and @code{omp_unset_lock},
with the name being transformed into a variable declared like

@smallexample
  omp_lock_t gomp_critical_user_<name> __attribute__((common))
@end smallexample

Ideally the ABI would specify that all-zero is a valid unlocked
state, so that we wouldn't actually need to initialize this at
startup.



@node Implementing ATOMIC construct
@section Implementing ATOMIC construct

The target should implement the @code{__sync} builtins.

Failing that, we could add

@smallexample
  void GOMP_atomic_enter (void)
  void GOMP_atomic_exit (void)
@end smallexample

which reuses the regular lock code, but with yet another lock
object private to the library.



@node Implementing FLUSH construct
@section Implementing FLUSH construct

Expands to the @code{__sync_synchronize} builtin.



@node Implementing BARRIER construct
@section Implementing BARRIER construct

@smallexample
  void GOMP_barrier (void)
@end smallexample


@node Implementing THREADPRIVATE construct
@section Implementing THREADPRIVATE construct

In @emph{most} cases we can map this directly to @code{__thread},
except that OpenMP allows constructors for C++ objects.  We can either
refuse to support this (how often is it used?) or
implement something akin to @code{.ctors}.

Ideally, this ctor feature would be handled by extensions
to the main pthreads library.  Failing that, we can have a set
of entry points to register ctor functions to be called.



@node Implementing PRIVATE clause
@section Implementing PRIVATE clause

In association with a PARALLEL, or within the lexical extent
of a PARALLEL block, the variable becomes a local variable in
the parallel subfunction.

In association with FOR or SECTIONS blocks, create a new
automatic variable within the current function.  This preserves
the semantics of new variable creation.



@node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
@section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses

Seems simple enough for PARALLEL blocks.  Create a private
struct for communicating between the parent and the subfunction.
In the parent, copy in values for scalars and ``small'' structs;
copy in addresses for other @code{TREE_ADDRESSABLE} types.  In the
subfunction, copy the value into the local variable.

Not clear at all what to do with bare FOR or SECTIONS blocks.
The only thing I can figure is that we do something like

@smallexample
#pragma omp for firstprivate(x) lastprivate(y)
for (int i = 0; i < n; ++i)
  body;
@end smallexample

which becomes

@smallexample
@{
  int x = x, y;

  // for stuff

  if (i == n)
    y = y;
@}
@end smallexample

where the @code{x = x} and @code{y = y} assignments actually have different
uids for the two variables, i.e. not something you could write
directly in C.
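
Since the shadowing cannot be written directly in C, it can be
approximated by giving the private copies their own names.  The following
is only a minimal single-threaded sketch: the names @code{x_priv},
@code{y_priv} and @code{lowered_loop}, the stand-in loop body, and the
extra @code{n > 0} guard are assumptions made for illustration and are
not part of libgomp or of the compiler's actual output.

@verbatim
```c
/* The "outer" variables; file-scope, for simplicity of the sketch.  */
static int x = 5;
static int y = -1;

/* Hand-lowered form of
     #pragma omp for firstprivate(x) lastprivate(y)
     for (int i = 0; i < n; ++i) y = x + i;
   as run by one thread that executes every iteration, so the
   work-sharing split itself is left out.  */
static void
lowered_loop (int n)
{
  int x_priv = x;   /* firstprivate: the "x = x" copy-in */
  int y_priv = 0;   /* lastprivate: private copy, zeroed only for this sketch */
  int i;

  for (i = 0; i < n; ++i)
    y_priv = x_priv + i;   /* stand-in for "body" */

  /* Only the thread that ran the sequentially last iteration copies out.
     The n > 0 guard is an assumption added here so that an empty loop
     leaves the outer y untouched.  */
  if (n > 0 && i == n)
    y = y_priv;            /* lastprivate: the "y = y" copy-out */
}
```
@end verbatim

Calling @code{lowered_loop (4)} leaves the outer @code{x} unchanged and
stores the value computed by the last iteration into the outer @code{y}.
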
Presumably this only makes sense if the ``outer''
x and y are global variables.

COPYPRIVATE would work the same way, except the structure
broadcast would have to happen via SINGLE machinery instead.



@node Implementing REDUCTION clause
@section Implementing REDUCTION clause

The private struct mentioned in the previous section should have
a pointer to an array of the type of the variable, indexed by the
thread's @var{team_id}.  The thread stores its final value into the
array, and after the barrier the master thread iterates over the
array to collect the values.


@node Implementing PARALLEL construct
@section Implementing PARALLEL construct

@smallexample
  #pragma omp parallel
  @{
    body;
  @}
@end smallexample

becomes

@smallexample
  void subfunction (void *data)
  @{
    use data;
    body;
  @}

  setup data;
  GOMP_parallel_start (subfunction, &data, num_threads);
  subfunction (&data);
  GOMP_parallel_end ();
@end smallexample

@smallexample
  void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
@end smallexample

The @var{FN} argument is the subfunction to be run in parallel.

The @var{DATA} argument is a pointer to a structure used to
communicate data in and out of the subfunction, as discussed
above with respect to FIRSTPRIVATE et al.

The @var{NUM_THREADS} argument is 1 if an IF clause is present
and evaluates to false; otherwise it is the value of the NUM_THREADS
clause, if present, or 0.

The function needs to create the appropriate number of
threads and/or launch them from the dock.
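
The rule for the @var{NUM_THREADS} argument can be written down as a small
selection function.  This is a sketch only: @code{num_threads_arg} and its
parameters are hypothetical names, not a libgomp entry point, and the
reading that 0 leaves the choice to the runtime is an assumption.

@verbatim
```c
#include <stdbool.h>

/* Hypothetical helper, not part of libgomp: computes the value the
   compiler would pass as the num_threads argument of
   GOMP_parallel_start, following the rule stated in the text.  */
static unsigned
num_threads_arg (bool has_if, bool if_value,
                 bool has_num_threads, unsigned num_threads_value)
{
  if (has_if && !if_value)
    return 1;                   /* IF clause present and false */
  if (has_num_threads)
    return num_threads_value;   /* value of the NUM_THREADS clause */
  return 0;                     /* neither applies; assumed to mean
                                   "runtime decides" */
}
```
@end verbatim

For example, a false IF clause yields 1 even when a NUM_THREADS clause
asks for 8 threads, since the IF clause is checked first.
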
The function also needs to
create the team structure and assign team ids.

@smallexample
  void GOMP_parallel_end (void)
@end smallexample

Tears down the team and returns us to the previous @code{omp_in_parallel()} state.



@node Implementing FOR construct
@section Implementing FOR construct

@smallexample
  #pragma omp parallel for
  for (i = lb; i <= ub; i++)
    body;
@end smallexample

becomes

@smallexample
  void subfunction (void *data)
  @{
    long _s0, _e0;
    while (GOMP_loop_static_next (&_s0, &_e0))
    @{
      long _e1 = _e0, i;
      for (i = _s0; i < _e1; i++)
        body;
    @}
    GOMP_loop_end_nowait ();
  @}

  GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
  subfunction (NULL);
  GOMP_parallel_end ();
@end smallexample

@smallexample
  #pragma omp for schedule(runtime)
  for (i = 0; i < n; i++)
    body;
@end smallexample

becomes

@smallexample
  @{
    long i, _s0, _e0;
    if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
      do @{
        long _e1 = _e0;
        for (i = _s0; i < _e1; i++)
          body;
      @} while (GOMP_loop_runtime_next (&_s0, &_e0));
    GOMP_loop_end ();
  @}
@end smallexample

Note that while it looks like there is trickiness to propagating
a non-constant STEP, there isn't really.  We're explicitly allowed
to evaluate it as many times as we want, and any variables involved
should automatically be handled as PRIVATE or SHARED like any other
variables.  So the expression should remain evaluable in the
subfunction.  We can also pull it into a local variable if we like,
but since it's supposed to remain unchanged, we can also not if we like.

If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
able to get away with no work-sharing context at all, since we can
simply perform the arithmetic directly in each thread to divide up
the iterations.  Which would mean that we wouldn't need to call any
of these routines.

There are separate routines for handling loops with an ORDERED
clause.
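
For the SCHEDULE(STATIC) case without ORDERED described above, the
per-thread arithmetic might look like the following sketch.  It assumes a
step of 1, a half-open iteration range, and one contiguous block per
thread; the name @code{static_share} and this particular way of spreading
the remainder are illustrative choices, not necessarily what libgomp does.

@verbatim
```c
/* Hypothetical helper: gives thread TID out of NTHREADS its contiguous
   share of the iterations [START, END) with step 1.  The bounds are
   returned in *S0 and *E0; the result is 1 if the share is non-empty.  */
static int
static_share (long start, long end, long nthreads, long tid,
              long *s0, long *e0)
{
  long n = end > start ? end - start : 0;   /* total iteration count */
  long q = n / nthreads;                    /* base share per thread */
  long r = n % nthreads;                    /* leftover iterations */

  /* The first R threads take one extra iteration each.  */
  long offset = tid * q + (tid < r ? tid : r);
  long len = q + (tid < r ? 1 : 0);

  *s0 = start + offset;
  *e0 = *s0 + len;
  return len > 0;
}
```
@end verbatim

Each thread would call this once with its own team id and then run
@code{for (i = s0; i < e0; i++) body;} with no calls into the library.
With 10 iterations and 4 threads, the shares are 3, 3, 2 and 2 iterations.
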
The bookkeeping for ORDERED loops is non-trivial...



@node Implementing ORDERED construct
@section Implementing ORDERED construct

@smallexample
  void GOMP_ordered_start (void)
  void GOMP_ordered_end (void)
@end smallexample



@node Implementing SECTIONS construct
@section Implementing SECTIONS construct

A block like

@smallexample
  #pragma omp sections
  @{
    #pragma omp section
    stmt1;
    #pragma omp section
    stmt2;
    #pragma omp section
    stmt3;
  @}
@end smallexample

becomes

@smallexample
  for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
    switch (i)
      @{
      case 1:
        stmt1;
        break;
      case 2:
        stmt2;
        break;
      case 3:
        stmt3;
        break;
      @}
  GOMP_barrier ();
@end smallexample


@node Implementing SINGLE construct
@section Implementing SINGLE construct

A block like

@smallexample
  #pragma omp single
  @{
    body;
  @}
@end smallexample

becomes

@smallexample
  if (GOMP_single_start ())
    body;
  GOMP_barrier ();
@end smallexample

while

@smallexample
  #pragma omp single copyprivate(x)
    body;
@end smallexample

becomes

@smallexample
  datap = GOMP_single_copy_start ();
  if (datap == NULL)
    @{
      body;
      data.x = x;
      GOMP_single_copy_end (&data);
    @}
  else
    x = datap->x;
  GOMP_barrier ();
@end smallexample



@c ---------------------------------------------------------------------
@c
@c ---------------------------------------------------------------------

@node Reporting Bugs
@chapter Reporting Bugs

Bugs in the GNU OpenMP implementation should be reported via
@uref{http://gcc.gnu.org/bugzilla/, bugzilla}.
In all cases, please add
``openmp'' to the keywords field in the bug report.



@c ---------------------------------------------------------------------
@c GNU General Public License
@c ---------------------------------------------------------------------

@include gpl.texi



@c ---------------------------------------------------------------------
@c GNU Free Documentation License
@c ---------------------------------------------------------------------

@include fdl.texi



@c ---------------------------------------------------------------------
@c Funding Free Software
@c ---------------------------------------------------------------------

@include funding.texi

@c ---------------------------------------------------------------------
@c Index
@c ---------------------------------------------------------------------

@node Index
@unnumbered Index

@printindex cp

@bye