Using the \fI--bynode\fP option tells Open MPI to use all available nodes.
Using the \fI--byslot\fP option tells Open MPI to use all slots on an available
node before allocating resources on the next available node.
For example:
.
.TP 4
mpirun --bynode -np 4 a.out
Runs one copy of the executable
.I a.out
on all available nodes in the Open MPI universe. MPI_COMM_WORLD rank 0
will be on node0, rank 1 will be on node1, etc., regardless of how many slots
are available on each of the nodes.
.
.
.TP
mpirun --byslot -np 4 a.out
Runs one copy of the executable
.I a.out
on each slot on a given node before running the executable on other available
nodes.
.
.
.
.SS Specifying Hosts
.
Hosts can be specified in a number of ways, the most common of which is in a
'hostfile' or 'machinefile'. If our hostfile contains the following information:
.
 \fBshell$\fP cat my-hostfile
 node00 slots=2
 node01 slots=2
 node02 slots=2
.
.
.TP
mpirun --hostfile my-hostfile -np 3 a.out
This will run one copy of the executable
.I a.out
on hosts node00, node01, and node02.
.
.
.PP
Another method for specifying hosts is directly on the command line. Here one
can include and exclude hosts from the set of hosts to run on. For example:
.
.
.TP
mpirun -np 3 --host a a.out
Runs three copies of the executable
.I a.out
on host a.
.
.
.TP
mpirun -np 3 --host a,b,c a.out
Runs one copy of the executable
.I a.out
on each of hosts a, b, and c.
.
.
.TP
mpirun -np 3 --hostfile my-hostfile --host node00 a.out
Runs three copies of the executable
.I a.out
on host node00.
.
.
.TP
mpirun -np 3 --hostfile my-hostfile --host node10 a.out
This will produce an error since node10 is not in my-hostfile; mpirun will
abort.
.
.
.TP
shell$ mpirun -np 1 --host a hostname : -np 2 --host b,c uptime
Runs one copy of the executable
.I hostname
on host a, and one copy of the executable
.I uptime
on each of hosts b and c.
.
.
.
.SS No Local Launch
.
Using the \fB--nolocal\fR option to orterun tells the system not to
launch any of the application processes on the same node that orterun
is running on. While orterun typically blocks and consumes few system
resources, this option can be helpful for launching very large jobs
where orterun may actually need to use noticeable amounts of memory
and/or processing time. \fB--nolocal\fR allows orterun to run without
sharing the local node with the launched applications, and likewise
allows the launched applications to run unhindered by orterun's system
usage.
.PP
Note that \fB--nolocal\fR will override any other specification to
launch the application on the local node. It will disqualify the
localhost from being capable of running any processes in the
application.
.
.
.TP
shell$ mpirun -np 1 --host localhost --nolocal hostname
This example will result in an error because orterun will not find
anywhere to launch the application.
.
.
.
.SS No Oversubscription
.
Using the \fI--nooversubscribe\fR option causes Open MPI to implicitly
set the "max_slots" value to be the same as the "slots" value for each
node. This can be especially helpful when running jobs under a
resource manager because Open MPI currently only sets the "slots"
value for each node that it obtains from the resource manager.
.
.
.
.SS Application Context or Executable Program?
.
To distinguish the two different forms, \fImpirun\fP
looks on the command line for the \fI--app\fP option. If
it is specified, then the file named on the command line is
assumed to be an application context.
If it is not specified, then the file is assumed to be an executable program.
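.PP
For illustration, an application context is a plain text file in which each
line gives the options and program for one part of the job, using the same
command line syntax that \fImpirun\fP itself accepts. A minimal sketch (the
file name, hosts, and programs are hypothetical):
.
 \fBshell$\fP cat my_appfile
 -np 2 --host node00 prog1
 -np 4 --host node01,node02 prog2
 \fBshell$\fP mpirun --app my_appfile
.PP
This should launch prog1 and prog2 as a single MPI job, much as if the two
lines had been given on one \fImpirun\fP command line separated by a colon
(as in the hostname/uptime example above).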
.
.
.
.SS Locating Files
.
If \fIno\fP relative or absolute path is specified for a file, Open MPI
will look for files by searching the directories in the user's PATH environment
variable as defined on the source node(s).
.PP
If a relative directory is specified, it must be relative to the initial
working directory determined by the specific starter used. For example, when
using the rsh or ssh starters, the initial directory is $HOME by default. Other
starters may set the initial directory to the current working directory from
the invocation of \fImpirun\fP.
.
.
.
.SS Current Working Directory
.
The \fI\-wdir\fP mpirun option (and its synonym, \fI\-wd\fP) allows
the user to change to an arbitrary directory before the program is
invoked. It can also be used in application context files to specify
working directories on specific nodes and/or for specific
applications.
.PP
If the \fI\-wdir\fP option appears both in a context file and on the
command line, the context file directory will override the command
line value.
.PP
If the \fI-wdir\fP option is specified, Open MPI will attempt to
change to the specified directory on all of the remote nodes. If this
fails, \fImpirun\fP will abort.
.PP
If the \fI-wdir\fP option is \fBnot\fP specified, Open MPI will send
the directory name where \fImpirun\fP was invoked to each of the
remote nodes. The remote nodes will try to change to that
directory. If they are unable (e.g., if the directory does not exist on
that node), then Open MPI will use the default directory determined by
the starter.
.PP
All directory changing occurs before the user's program is invoked; it
does not wait until \fIMPI_INIT\fP is called.
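.PP
For example (a minimal sketch; the directory name is hypothetical and must
already exist on every node, or \fImpirun\fP will abort as described above):
.
 \fBshell$\fP mpirun -wdir /tmp/run1 -np 2 a.out
.PP
Each process then has /tmp/run1 as its working directory when
.I a.out
begins executing, regardless of the directory from which \fImpirun\fP was
invoked.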
.
.
.
.SS Standard I/O
.
Open MPI directs UNIX standard input to /dev/null on all processes
except the MPI_COMM_WORLD rank 0 process. The MPI_COMM_WORLD rank 0 process
inherits standard input from \fImpirun\fP.
.B Note:
The node that invoked \fImpirun\fP need not be the same as the node where the
MPI_COMM_WORLD rank 0 process resides. Open MPI handles the redirection of
\fImpirun\fP's standard input to the rank 0 process.
.PP
Open MPI directs UNIX standard output and error from remote nodes to the node
that invoked \fImpirun\fP and prints it on the standard output/error of
\fImpirun\fP.
Local processes inherit the standard output/error of \fImpirun\fP and transfer
to it directly.
.PP
Thus it is possible to redirect standard I/O for Open MPI applications by
using the typical shell redirection procedure on \fImpirun\fP.
.PP
 \fBshell$\fP mpirun -np 2 my_app < my_input > my_output
.PP
Note that in this example \fIonly\fP the MPI_COMM_WORLD rank 0 process will
receive the stream from \fImy_input\fP on stdin. The stdin on all the other
nodes will be tied to /dev/null. However, the stdout from all nodes will
be collected into the \fImy_output\fP file.
.
.
.
.SS Signal Propagation
.
When orterun receives a SIGTERM or SIGINT, it will attempt to kill
the entire job by sending all processes in the job a SIGTERM, waiting
a small number of seconds, then sending all processes in the job a
SIGKILL.
.PP
SIGUSR1 and SIGUSR2 signals received by orterun are propagated to
all processes in the job. Other signals are not currently propagated
by orterun.
.
.
.
.SS Process Termination / Signal Handling
.
During the run of an MPI application, if any rank dies abnormally
(either exiting before invoking \fIMPI_FINALIZE\fP, or dying as the result of a
signal), \fImpirun\fP will print out an error message and kill the rest of the
MPI application.
.PP
User signal handlers should probably avoid trying to clean up MPI state
(Open MPI is, currently, neither thread-safe nor async-signal-safe).
For example, if a segmentation fault occurs in \fIMPI_SEND\fP (perhaps because
a bad buffer was passed in) and a user signal handler is invoked, if this user
handler attempts to invoke \fIMPI_FINALIZE\fP, Bad Things could happen since
Open MPI was already "in" MPI when the error occurred. Since \fImpirun\fP
will notice that the process died due to a signal, it is probably not
necessary (and is safest) for the user to clean up only non-MPI state.
.
.
.
.SS Process Environment
.
Processes in the MPI application inherit their environment from the
Open RTE daemon on the node on which they are running. The
environment is typically inherited from the user's shell. On remote
nodes, the exact environment is determined by the boot MCA module
used. The \fIrsh\fR launch module, for example, uses either
\fIrsh\fR or \fIssh\fR to launch the Open RTE daemon on remote nodes, and
typically executes one or more of the user's shell-setup files before
launching the Open RTE daemon. When running dynamically linked
applications which require the \fILD_LIBRARY_PATH\fR environment
variable to be set, care must be taken to ensure that it is correctly
set when booting Open MPI.
.PP
See the "Remote Execution" section for more details.
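.PP
For example, in an \fIrsh\fR/\fIssh\fR environment one common approach is to
export the variables from a shell-setup file that the remote shell reads for
non-interactive logins. A minimal sketch, assuming a Bourne-style shell and a
hypothetical installation under /opt/openmpi (exactly which setup file is
read, e.g. \fI.bashrc\fP or \fI.profile\fP, depends on the shell):
.
 export PATH=/opt/openmpi/bin:$PATH
 export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH
.PP
The \fI--prefix\fR option described in the next section can achieve a similar
effect without editing shell-setup files.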
.
.
.
.SS Remote Execution
.
Open MPI requires that the \fIPATH\fR environment variable be set to
find executables on remote nodes (this is typically only necessary in
\fIrsh\fR- or \fIssh\fR-based environments -- batch/scheduled
environments typically copy the current environment to the execution
of remote jobs, so if the current environment has \fIPATH\fR and/or
\fILD_LIBRARY_PATH\fR set properly, the remote nodes will also have it
set properly). If Open MPI was compiled with shared library support,
it may also be necessary to have the \fILD_LIBRARY_PATH\fR environment
variable set on remote nodes as well (especially to find the shared
libraries required to run user MPI applications).
.PP
However, it is not always desirable or possible to edit shell
startup files to set \fIPATH\fR and/or \fILD_LIBRARY_PATH\fR. The
\fI--prefix\fR option is provided for some simple configurations where
this is not possible.
.PP
The \fI--prefix\fR option takes a single argument: the base directory
on the remote node where Open MPI is installed. Open MPI will use
this directory to set the remote \fIPATH\fR and \fILD_LIBRARY_PATH\fR
before executing any Open MPI or user applications. This allows
running Open MPI jobs without having pre-configured the \fIPATH\fR and
\fILD_LIBRARY_PATH\fR on the remote nodes.
.PP
Open MPI adds the basename of the current
node's "bindir" (the directory where Open MPI's executables are
installed) to the prefix and uses that to set the \fIPATH\fR on the
remote node. Similarly, Open MPI adds the basename of the current
node's "libdir" (the directory where Open MPI's libraries are
installed) to the prefix and uses that to set the
\fILD_LIBRARY_PATH\fR on the remote node.
For example:
.TP 15
Local bindir:
/local/node/directory/bin
.TP
Local libdir:
/local/node/directory/lib64
.PP
If the following command line is used:
.PP
 \fBshell$\fP mpirun --prefix /remote/node/directory
.PP
Open MPI will add "/remote/node/directory/bin" to the \fIPATH\fR
and "/remote/node/directory/lib64" to the \fILD_LIBRARY_PATH\fR on the
remote node before attempting to execute anything.
.PP
Note that \fI--prefix\fR can be set on a per-context basis, allowing
for different values for different nodes.
.PP
The \fI--prefix\fR option is not sufficient if the installation paths
on the remote node are different than those on the local node (e.g., if "/lib"
is used on the local node, but "/lib64" is used on the remote node),
or if the installation paths are something other than a subdirectory
under a common prefix.
.PP
Note that executing \fImpirun\fR via an absolute pathname is
equivalent to specifying \fI--prefix\fR without the last subdirectory
in the absolute pathname to \fImpirun\fR. For example:
.PP
 \fBshell$\fP /usr/local/bin/mpirun ...
.PP
is equivalent to
.PP
 \fBshell$\fP mpirun --prefix /usr/local
.
.
.
.SS Exported Environment Variables
.
All environment variables that are named in the form OMPI_* will automatically
be exported to new processes on the local and remote nodes.
The \fI\-x\fP option to \fImpirun\fP can be used to export specific environment
variables to the new processes. While the syntax of the \fI\-x\fP
option allows the definition of new variables, note that the parser
for this option is currently not very sophisticated -- it does not even
understand quoted values. Users are advised to set variables in the
environment and use \fI\-x\fP to export them, not to define them.
.
.
.
.SS MCA (Modular Component Architecture)
.
The \fI-mca\fP switch allows the passing of parameters to various MCA modules.
.\" Open MPI's MCA modules are described in detail in ompimca(7).
MCA modules have direct impact on MPI programs because they allow tunable
parameters to be set at run time (such as which BTL communication device driver
to use, what parameters to pass to that BTL, etc.).
.PP
The \fI-mca\fP switch takes two arguments: \fI<key>\fP and \fI<value>\fP.
The \fI<key>\fP argument generally specifies which MCA module will receive the value.
For example, the \fI<key>\fP "btl" is used to select which BTL is to be used for
transporting MPI messages. The \fI<value>\fP argument is the value that is
passed.
For example:
.
.TP 4
mpirun -mca btl tcp,self -np 1 foo
Tells Open MPI to use the "tcp" and "self" BTLs, and to run a single copy of
"foo" on an allocated node.
.
.TP
mpirun -mca btl self -np 1 foo
Tells Open MPI to use the "self" BTL, and to run a single copy of "foo" on an
allocated node.
.\" And so on. Open MPI's BTL MCA modules are described in ompimca_btl(7).
.PP
The \fI-mca\fP switch can be used multiple times to specify different
\fI<key>\fP and/or \fI<value>\fP arguments. If the same \fI<key>\fP is
specified more than once, the \fI<value>\fPs are concatenated with a comma
(",") separating them.
.PP
.B Note:
The \fI-mca\fP switch is simply a shortcut for setting environment variables.
The same effect may be accomplished by setting corresponding environment
variables before running \fImpirun\fP.
The form of the environment variables that Open MPI sets is:
.PP
 OMPI_MCA_<key>=<value>
.PP
Note that the \fI-mca\fP switch overrides any previously set environment
variables. Also note that unknown \fI<key>\fP arguments are still set as
environment variables -- they are not checked (by \fImpirun\fP) for
correctness. Illegal or incorrect \fI<value>\fP arguments may or may not be
reported -- it depends on the specific MCA module.
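.PP
For example, the following two invocations should be equivalent (a minimal
sketch, assuming a Bourne-style shell):
.PP
 \fBshell$\fP mpirun -mca btl tcp,self -np 1 foo
.PP
 \fBshell$\fP export OMPI_MCA_btl=tcp,self
 \fBshell$\fP mpirun -np 1 foo
.PP
In both cases the "btl" MCA parameter is set to "tcp,self" for the job; if
both forms are used at once, the \fI-mca\fP switch takes precedence as noted
above.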
.\" **************************
.\"    Examples Section
.\" **************************
.SH EXAMPLES
Be sure to also see the examples in the "Location Nomenclature" section, above.
.
.TP 4
mpirun -np 1 prog1
Load and execute prog1 on one node. Search the user's $PATH for the
executable file on each node.
.
.
.TP
mpirun -np 8 --byslot prog1
Run 8 copies of prog1 wherever Open MPI wants to run them.
.
.
.TP
mpirun -np 4 -mca btl ib,tcp,self prog1
Run 4 copies of prog1 using the "ib", "tcp", and "self" BTLs for the transport
of MPI messages.
.
.\" **************************
.\"    Diagnostics Section
.\" **************************
.
.\" .SH DIAGNOSTICS
.\" .TP 4
.\" Error Msg:
.\" Description
.
.\" **************************
.\"    Return Value Section
.\" **************************
.
.SH RETURN VALUE
.
\fImpirun\fP returns 0 if all ranks started by \fImpirun\fP exit after calling
MPI_FINALIZE. A non-zero value is returned if an internal error occurred in
mpirun, or one or more ranks exited before calling MPI_FINALIZE. If an
internal error occurred in mpirun, the corresponding error code is returned.
In the event that one or more ranks exit before calling MPI_FINALIZE, the
return value of the rank of the process that \fImpirun\fP first notices died
before calling MPI_FINALIZE will be returned. Note that, in general, this will
be the first rank that died, but it is not guaranteed to be so.
.PP
However, note that if the \fI-nw\fP switch is used, the return value from
mpirun does not indicate the exit status of the ranks.
.
.\" **************************
.\"    See Also Section
.\" **************************
.
.\" .SH SEE ALSO
.\" orted(1)