.\"Copyright 2006, Sun Microsystems, Inc.
.\" Copyright (c) 1996 Thinking Machines Corporation
.TH MPI_Reduce 3OpenMPI "September 2006" "Open MPI 1.2" " "
.SH NAME
\fBMPI_Reduce\fP \- Reduces values on all processes within a group.
.SH SYNTAX
.ft R
.SH C Syntax
.nf
#include <mpi.h>
int MPI_Reduce(void *\fIsendbuf\fP, void *\fIrecvbuf\fP, int\fI count\fP,
    MPI_Datatype\fI datatype\fP, MPI_Op\fI op\fP, int\fI root\fP, MPI_Comm\fI comm\fP)
.fi
.SH Fortran Syntax
.nf
INCLUDE 'mpif.h'
MPI_REDUCE(\fISENDBUF, RECVBUF, COUNT, DATATYPE, OP, ROOT, COMM,
        IERROR\fP)
    <type>    \fISENDBUF(*), RECVBUF(*)\fP
    INTEGER   \fICOUNT, DATATYPE, OP, ROOT, COMM, IERROR\fP
.fi
.SH C++ Syntax
.nf
#include <mpi.h>
void MPI::Intracomm::Reduce(const void* \fIsendbuf\fP, void* \fIrecvbuf\fP,
    int \fIcount\fP, const MPI::Datatype& \fIdatatype\fP, const MPI::Op& \fIop\fP,
    int \fIroot\fP) const
.fi
.SH INPUT PARAMETERS
.ft R
.TP 1i
sendbuf
Address of send buffer (choice).
.TP 1i
count
Number of elements in send buffer (integer).
.TP 1i
datatype
Data type of elements of send buffer (handle).
.TP 1i
op
Reduce operation (handle).
.TP 1i
root
Rank of root process (integer).
.TP 1i
comm
Communicator (handle).
.SH OUTPUT PARAMETERS
.ft R
.TP 1i
recvbuf
Address of receive buffer (choice, significant only at root).
.ft R
.TP 1i
IERROR
Fortran only: Error status (integer).
.SH DESCRIPTION
.ft R
The global reduce functions (MPI_Reduce, MPI_Op_create, MPI_Op_free, MPI_Allreduce, MPI_Reduce_scatter, MPI_Scan) perform a global reduce operation (such as sum, max, logical AND, etc.) across all the members of a group. The reduction operation can be either one of a predefined list of operations, or a user-defined operation. The global reduction functions come in several flavors: a reduce that returns the result of the reduction at one node, an all-reduce that returns this result at all nodes, and a scan (parallel prefix) operation.
In addition, a reduce-scatter operation combines the functionality of a reduce and a scatter operation.
.sp
MPI_Reduce combines the elements provided in the input buffer of each process in the group, using the operation op, and returns the combined value in the output buffer of the process with rank root. The input buffer is defined by the arguments sendbuf, count, and datatype; the output buffer is defined by the arguments recvbuf, count, and datatype; both have the same number of elements, with the same type. The routine is called by all group members using the same arguments for count, datatype, op, root, and comm. Thus, all processes provide input buffers and output buffers of the same length, with elements of the same type. Each process can provide one element, or a sequence of elements, in which case the combine operation is executed element-wise on each entry of the sequence. For example, if the operation is MPI_MAX and the send buffer contains two elements that are floating-point numbers (count = 2 and datatype = MPI_FLOAT), then recvbuf(1) = global max(sendbuf(1)) and recvbuf(2) = global max(sendbuf(2)).
.sp
.SH USE OF IN-PLACE OPTION
When the communicator is an intracommunicator, you can perform a reduce operation in-place (the output buffer is used as the input buffer). Use the variable MPI_IN_PLACE as the value of the root process \fIsendbuf\fR. In this case, the input data is taken at the root from the receive buffer, where it will be replaced by the output data.
.sp
Note that MPI_IN_PLACE is a special kind of value; it has the same restrictions on its use as MPI_BOTTOM.
.sp
Because the in-place option converts the receive buffer into a send-and-receive buffer, a Fortran binding that includes INTENT must mark these as INOUT, not OUT.
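.sp
The element-wise combination that MPI_Reduce performs, and the way the in-place option lets the root's receive buffer double as its input, can be sketched in plain C without an MPI library. This is only a model under stated assumptions: simulated_reduce_max is a hypothetical helper (not an MPI routine), each simulated rank is represented by a pointer to its send buffer, and the operation is fixed to the maximum, as with MPI_MAX.
.sp
.nf
```c
#include <assert.h>

/*
 * Hypothetical helper, not part of MPI: models what MPI_Reduce with MPI_MAX
 * leaves in the root's recvbuf. sendbufs[r] points to the count floats
 * contributed by simulated rank r.
 */
void simulated_reduce_max(const float *const *sendbufs, int nranks,
                          int count, float *recvbuf)
{
    assert(nranks > 0 && count >= 0);
    for (int i = 0; i < count; i++) {
        /* Element i is combined independently of every other element. */
        float best = sendbufs[0][i];
        for (int r = 1; r < nranks; r++) {
            if (sendbufs[r][i] > best)
                best = sendbufs[r][i];
        }
        /* Written only after every rank's element i has been read, so
         * recvbuf may alias one of the send buffers (the in-place case). */
        recvbuf[i] = best;
    }
}
```
.fi
.sp
With three simulated ranks contributing (1, 9), (4, 2), and (3, 5), the helper leaves (4, 9) in recvbuf. Passing the root's receive buffer as that rank's entry in sendbufs models the in-place case: the root's input is read from the receive buffer and then overwritten by the result.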
.sp
.SH WHEN COMMUNICATOR IS AN INTER-COMMUNICATOR
.sp
When the communicator is an inter-communicator, the root process in the first group combines data from all the processes in the second group and then performs the \fIop\fR operation. The first group defines the root process. That process uses MPI_ROOT as the value of its \fIroot\fR argument. The remaining processes use MPI_PROC_NULL as the value of their \fIroot\fR argument. All processes in the second group use the rank of that root process in the first group as the value of their \fIroot\fR argument. Only the send buffer arguments are significant in the second group, and only the receive buffer arguments are significant in the root process of the first group.
.sp
.SH PREDEFINED REDUCE OPERATIONS
.sp
The set of predefined operations provided by MPI is listed below (Predefined Reduce Operations). That section also enumerates the datatypes each operation can be applied to. In addition, users may define their own operations that can be overloaded to operate on several datatypes, either basic or derived. This is further explained in the description of the user-defined operations (see the man pages for MPI_Op_create and MPI_Op_free).
.sp
The operation op is always assumed to be associative. All predefined operations are also assumed to be commutative. Users may define operations that are assumed to be associative, but not commutative. The ``canonical'' evaluation order of a reduction is determined by the ranks of the processes in the group. However, the implementation can take advantage of associativity, or associativity and commutativity, in order to change the order of evaluation. This may change the result of the reduction for operations that are not strictly associative and commutative, such as floating point addition.
.sp
Predefined operators work only with the MPI types listed below (Predefined Reduce Operations, and the section MINLOC and MAXLOC, below).
User-defined operators may operate on general, derived datatypes. In this case, each argument that the reduce operation is applied to is one element described by such a datatype, which may contain several basic values. This is further explained in Section 4.9.4 of the MPI Standard, "User-Defined Operations."
.sp
The following predefined operations are supplied for MPI_Reduce and related functions MPI_Allreduce, MPI_Reduce_scatter, and MPI_Scan. These operations are invoked by placing the following in op:
.sp
.nf
    Name                Meaning
    ---------           --------------------
    MPI_MAX             maximum
    MPI_MIN             minimum
    MPI_SUM             sum
    MPI_PROD            product
    MPI_LAND            logical and
    MPI_BAND            bit-wise and
    MPI_LOR             logical or
    MPI_BOR             bit-wise or
    MPI_LXOR            logical xor
    MPI_BXOR            bit-wise xor
    MPI_MAXLOC          max value and location
    MPI_MINLOC          min value and location
.fi
.sp
The two operations MPI_MINLOC and MPI_MAXLOC are discussed separately below (MINLOC and MAXLOC). For the other predefined operations, we enumerate below the allowed combinations of op and datatype arguments. First, define groups of MPI basic datatypes in the following way:
.sp
.nf
    C integer:            MPI_INT, MPI_LONG, MPI_SHORT,
                          MPI_UNSIGNED_SHORT, MPI_UNSIGNED,
                          MPI_UNSIGNED_LONG
    Fortran integer:      MPI_INTEGER
    Floating-point:       MPI_FLOAT, MPI_DOUBLE, MPI_REAL,
                          MPI_DOUBLE_PRECISION,
                          MPI_LONG_DOUBLE
    Logical:              MPI_LOGICAL
    Complex:              MPI_COMPLEX
    Byte:                 MPI_BYTE
.fi
.sp
The valid datatypes for each operation are specified below.
.sp
.nf
    Op                      Allowed Types
    ----------------        ---------------------------
    MPI_MAX, MPI_MIN        C integer, Fortran integer,
                            floating-point

    MPI_SUM, MPI_PROD       C integer, Fortran integer,
                            floating-point, complex

    MPI_LAND, MPI_LOR,      C integer, logical
    MPI_LXOR

    MPI_BAND, MPI_BOR,      C integer, Fortran integer, byte
    MPI_BXOR
.fi
.sp
\fBExample 1:\fR A routine that computes the dot product of two vectors that are distributed across a group of processes and returns the answer at process zero.
.sp
.nf
    SUBROUTINE PAR_BLAS1(m, a, b, c, comm)
    INCLUDE 'mpif.h'      ! defines MPI_REAL and MPI_SUM
    REAL a(m), b(m)       ! local slice of array
    REAL c                ! result (at process zero)
    REAL sum
    INTEGER m, comm, i, ierr

    ! local sum
    sum = 0.0
    DO i = 1, m
       sum = sum + a(i)*b(i)
    END DO

    ! global sum
    CALL MPI_REDUCE(sum, c, 1, MPI_REAL, MPI_SUM, 0, comm, ierr)
    RETURN
    END
.fi
.sp
\fBExample 2:\fR A routine that computes the product of a vector and an array that are distributed across a group of processes and returns the answer at process zero.
.sp
.nf
    SUBROUTINE PAR_BLAS2(m, n, a, b, c, comm)
    INCLUDE 'mpif.h'      ! defines MPI_REAL and MPI_SUM
    REAL a(m), b(m,n)     ! local slice of array
    REAL c(n)             ! result
    REAL sum(n)
    INTEGER m, n, comm, i, j, ierr

    ! local sum
    DO j = 1, n
       sum(j) = 0.0
       DO i = 1, m
          sum(j) = sum(j) + a(i)*b(i,j)
       END DO
    END DO

    ! global sum
    CALL MPI_REDUCE(sum, c, n, MPI_REAL, MPI_SUM, 0, comm, ierr)

    ! return result at process zero (and garbage at the other nodes)
    RETURN
    END
.fi
.SH MINLOC AND MAXLOC
.ft R
The operator MPI_MINLOC is used to compute a global minimum and also an index attached to the minimum value. MPI_MAXLOC similarly computes a global maximum and index. One application of these is to compute a global minimum (maximum) and the rank of the process containing this value.
.sp
The operation that defines MPI_MAXLOC is
.sp
.nf
         ( u )    ( v )      ( w )
         (   )  o (   )   =  (   )
         ( i )    ( j )      ( k )

where

    w = max(u, v)

and

         ( i            if u > v
         (
   k   = ( min(i, j)    if u = v
         (
         ( j            if u < v

MPI_MINLOC is defined similarly:

         ( u )    ( v )      ( w )
         (   )  o (   )   =  (   )
         ( i )    ( j )      ( k )

where

    w = min(u, v)

and

         ( i            if u < v