<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><meta name="generator" content="HTML Tidy, see www.w3.org"><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"><link type="text/css" rel="stylesheet" href="style.css"><!-- Generated by The Open Group's rhtm tool v1.2.1 --><!-- Copyright (c) 2001-2003 The Open Group, All Rights Reserved --><title>Rationale</title></head><body><basefont size="3"> <center><font size="2">The Open Group Base Specifications Issue 6<br>IEEE Std 1003.1, 2003 Edition<br>Copyright © 2001-2003 The IEEE and The Open Group</font></center><hr size="2" noshade><h3><a name="tag_02_03"></a>Batch Environment Services and Utilities</h3><h5><a name="tag_02_03_00_01"></a>Scope of the Batch Environment Services and Utilities Option</h5><p>This section summarizes the deliberations of the IEEE P1003.15 (Batch Environment) working group in the development of the Batch Environment Services and Utilities option, which covers a set of services and utilities defining a batch processing system.</p><p>This informative section contains historical information concerning the contents of the amendment and describes why features were included or discarded by the working group.</p><h5><a name="tag_02_03_00_02"></a>History of Batch Systems</h5><p>The supercomputing technical committee began as a "Birds Of a Feather" (BOF) at the January 1987 Usenix meeting. There was enough general interest to form a supercomputing attachment to the /usr/group working groups. Several subgroups rapidly formed. Of those subgroups, the batch group was the most ambitious. The first early meetings were spent evaluating user needs and existing batch implementations.</p><p>To evaluate user needs, individuals from the supercomputing community came and presented their needs. Common requests were flexibility, interoperability, control of resources, and ease-of-use. Backward-compatibility was not an issue. 
The working group then evaluated some existing systems. The following systems were evaluated:</p><ul><li><p>PROD</p></li><li><p>Convex Distributed Batch</p></li><li><p>NQS</p></li><li><p>CTSS</p></li><li><p>MDQS from Ballistics Research Laboratory (BRL)</p></li></ul><p>Finally, NQS was chosen as a model, not only because it satisfied the most user requirements, but also because it was in the public domain, already implemented on a variety of hardware platforms, and network-based.</p><h5><a name="tag_02_03_00_03"></a>Historical Implementations of Batch Systems</h5><p>Deferred processing of work under the control of a scheduler has been a feature of most proprietary operating systems from the earliest days of multi-user systems in order to maximize utilization of the computer.</p><p>The arrival of UNIX systems proved to be a dilemma to many hardware providers and users because they did not include the sophisticated batch facilities offered by the proprietary systems. This omission was rectified in 1986 by NASA Ames Research Center, which developed the Network Queuing System (NQS) as a portable UNIX application that allowed the routing and processing of batch "jobs" in a network. To encourage its usage, the product was later put into the public domain. It was promptly picked up by UNIX hardware providers, and ported and developed for their respective hardware and UNIX implementations.</p><p>Many major vendors, who traditionally offer a batch-dominated environment, ported the public-domain product to their systems, customized it to support the capabilities of their systems, and added many customer-requested features.</p><p>Due to the strong hardware provider and customer acceptance of NQS, it was decided in 1987 to use NQS as the basis for the POSIX Batch Environment amendment. Other batch systems considered at the time included CTSS, MDQS (a forerunner of NQS from the Ballistics Research Laboratory), and PROD (a Los Alamos Labs development). 
None were thought to have both the functionality and acceptability of NQS.</p><h5><a name="tag_02_03_00_04"></a>NQS Differences from the at utility</h5><p>The base standard <a href="../utilities/at.html"><i>at</i></a> and <a href="../utilities/batch.html"><i>batch</i></a> utilities are not sufficient to meet the batch processing needs in a supercomputing environment, and additional functionality in the areas of resource management, job scheduling, system management, and control of output is required.</p><h5><a name="tag_02_03_00_05"></a>Batch Environment Services and Utilities Option Definitions</h5><p>The concept of a batch job is closely related to a session with a session leader. The main difference is that a batch job does not have a controlling terminal. There has been much debate over whether to use the term "request" or "job". "Job" was the final choice because of the historical use of this term in the batch environment.</p><p>The current definition for job identifiers is not sufficient with the model of destinations. The current definition is:</p><blockquote><pre><tt>sequence_number.originating_host</tt></pre></blockquote><p>Using the model of destination, a host may include multiple batch nodes, the location of which is identified uniquely by a name or directory service. If the current definition is used, batch nodes running on the same host would have to coordinate their use of sequence numbers, as sequence numbers are assigned by the originating host. The alternative is to use the originating batch node name instead of the originating host name.</p><p>The reasons for wishing to run more than one batch system per host could be the following.</p><p>A test and production batch system are maintained on a single host. This is most likely in a development facility, but could also arise when a site is moving from one version to another. 
The new batch system could be installed as a test version that is completely separate from the production batch system, so that problems can be isolated to the test system. Requiring the batch nodes to coordinate their use of sequence numbers creates a dependency between the two nodes, and that defeats the purpose of running two nodes.</p><p>A site has multiple departments using a single host, with different management policies. An example of contention might be in job selection algorithms. One group might want a FIFO type of selection, while another group wishes to use a more complex algorithm based on resource availability. Again, requiring the batch nodes to coordinate is an unnecessary binding.</p><p>The proposal eventually accepted was to replace originating host with originating batch node. This supplies sufficient granularity to ensure unique job identifiers. If more than one batch node is on a particular host, they each have their own unique name.</p><p>The queue portion of a destination is not part of the job identifier as these are not required to be unique between batch nodes. For instance, two batch nodes may both have queues called small, medium, and large. It is only the batch node name that is uniquely identifiable throughout the batch system. The queue name has no additional function in this context.</p><p>Assume there are three batch nodes, each of which has its own name server. On batch node one, there are no queues. On batch node two, there are fifty queues. On batch node three, there are forty queues. The system administrator for batch node one does not have to configure queues, because there are none implemented. However, if a user wishes to send a job to either batch node two or three, the system administrator for batch node one must configure a destination that maps to the appropriate batch node and queue. 
If every queue is to be made accessible from batch node one, the system administrator has to configure ninety destinations.</p><p>To avoid requiring this, there should be a mechanism to allow a user to separate the destination into a batch node name and a queue name. Then, an implementation that is configured to get to all the batch nodes does not need any more configuration to allow a user to get to all of the queues on all of the batch nodes. The node name is used to locate the batch node, while the queue name is sent unchanged to that batch node.</p><p>The following are requirements that a destination identifier must be capable of providing:</p><ul><li><p>The ability to direct a job to a queue in a particular batch node.</p></li><li><p>The ability to direct a job to a particular batch node.</p></li><li><p>The ability to group at a higher level than just one queue. This includes grouping similar queues across multiple batch nodes (this is a pipe queue).</p></li><li><p>The ability to group batch nodes. This allows a user to submit a job to a group name with no knowledge of the batch node configuration. This also provides aliasing as a special case. Aliasing is a group containing only one batch node name. The group name is the alias.</p></li></ul><p>In addition, the administrator has the following requirements:</p><ul><li><p>The ability to control access to the queues.</p></li><li><p>The ability to control access to the batch nodes.</p></li><li><p>The ability to control access to groups of queues (pipe queues).</p></li><li><p>The ability to configure retry time intervals and durations.</p></li></ul><p>The requirements of the user are met by destination as explained in the following.</p><p>The user has the ability to specify a queue name, which is known only to the batch node specified. There is no configuration of these queues required on the submitting node.</p><p>The user has the ability to specify a batch node whose name is network-unique. 
The configuration required is that the batch node be defined as an application, just as other applications such as FTP are configured.</p><p>Once a job reaches a queue, it can again become a user of the batch system. The batch node can choose to send the job to another batch node or queue or both. In other words, the routing is at an application level, and it is up to the batch system to choose where the job will be sent. Configuration is up to the batch node where the queue resides. This provides grouping of queues across batch nodes or within a batch node. The user submits the job to a queue, which by definition routes the job to other queues or nodes or both.</p><p>A node name may be given to a naming service, which returns multiple addresses as opposed to just one. This provides grouping at a batch node level. This is a local issue, meaning that the batch node must choose only one of these addresses. The list of addresses is not sent with the job, and once the job is accepted on another node, there is no connection between the list and the job. The requirements of the administrator are met by destination as explained in the following.</p><p>The control of queues is a batch system issue, and will be done using the batch administrative utilities.</p><p>The control of nodes is a network issue, and will be done through whatever network facilities are available.</p><p>The control of access to groups of queues (pipe queues) is covered by the control of any other queue. The fact that the job may then be sent to another destination is not relevant.</p><p>The propagation of a job across more than one point-to-point connection was dropped because of its complexity and because all of the issues arising from this capability could not be resolved. It could be provided as additional functionality at some time in the