
📄 admin_guide.txt

resources indicate that doing so would otherwise cause a problem.

The values below attempt to maintain a system load within 75% to 100% of
the theoretical maximum (load average of 48.0 to 64.0 for a 64-cpu machine).

TARGET_LOAD_PCT                 90%
TARGET_LOAD_VARIANCE            -15%,+10%

The next section of options is used to enforce site-specific policies. It
is a good idea to reevaluate these policies as the user community grows,
shrinks, or changes its focus from porting and debugging to production.

Check for Prime Time Enforcement.  Sites with a mixed user base can use
this option to enforce separate scheduling policies at different times
during the day. If ENFORCE_PRIME_TIME is set to "False", the non-prime-time
scheduling policy (as described in BATCH_QUEUES) will be used for the entire
24 hour period.

ENFORCE_PRIME_TIME              False

Prime-time is defined as a time period each working day (Mon-Fri)
from PRIME_TIME_START through PRIME_TIME_END.  Times are in 24-hour
format with hours, minutes, and seconds (e.g. 9:00AM is 9:00:00, 5:00PM
is 17:00:00).  Sites can use the prime-time scheduling policy for the
entire 24 hour period by setting PRIME_TIME_START and PRIME_TIME_END
back-to-back.  The portion of a job that fits within prime-time must be
no longer than PRIME_TIME_WALLT_LIMIT (represented in HH:MM:SS).

#PRIME_TIME_START               9:00:00
#PRIME_TIME_END                 17:00:00
#PRIME_TIME_WALLT_LIMIT         1:00:00

The next option allows the site to choose an action to take upon scheduler
startup.  The default is to do no special processing (NONE). In some
instances, a job can end up queued in one of the batch queues because it
was running before but was stopped by PBS. If the argument is RESUBMIT,
these jobs will be moved back to the queue the job was originally submitted
to, and scheduled as if they had just arrived. If the argument is RERUN,
the scheduler will have PBS run any jobs found enqueued on the execution
queues.
This may cause the machine to get somewhat confused, as no limits
checking is done (the assumption being that they were checked when they
were enqueued).

SCHED_RESTART_ACTION            RESUBMIT

If the following directive points to a valid file, the scheduler will
dump a listing of all the jobs, in the order it would like to run them,
to this file. This is disabled (commented out) by default.

#SORTED_JOB_DUMPFILE            /PBS/JAMES/sched_priv/sorted_jobs

The Fair Access Directives allow the specification, on a per-queue basis,
of a per-user limit on the maximum number of simultaneously running jobs
and a 'time-credit': the maximum number of minutes of remaining run time
a given user can have outstanding at any given time. The username
"default" is interpreted as the default values for that queue. The format
of the <access_spec> is:

FAIR_ACCESS QUEUE:queuename:username:max_jobs:max_time_credit_minutes

#FAIR_ACCESS QUEUE:firstQ:myuserA:10:400
#FAIR_ACCESS QUEUE:firstQ:default:15:1000
#FAIR_ACCESS QUEUE:thirdQ:myuserA:11:200

* Lazy Commenting

Because changing the job comment for each of a large group of jobs can be
very expensive, there is a notion of lazy comments. The function that sets
the comment on a job takes a flag that indicates whether or not the comment
is optional.  Most of the "can't run because ..." comments are considered
to be optional.

When presented with an optional comment, the job will only be altered if
the job was enqueued after the last run of the scheduler, if it does not
already have a comment, or if the job's 'mtime' (modification time)
attribute indicates that the job has not been touched in MIN_COMMENT_AGE
seconds.

This should provide each job with a comment at least once per scheduler
lifetime.
It also provides an upper bound (MIN_COMMENT_AGE seconds + the
scheduling iteration) on the time between comment updates.

This compromise seemed reasonable because the comments themselves are
somewhat arbitrary, so keeping them up-to-date is not a high priority.


Installing The Custom Scheduler
-------------------------------

The custom scheduler is packaged as an optional scheduler for the OpenPBS
v.2.3 source code tree.

Rebuilding PBS to use custom scheduler
--------------------------------------

This custom scheduler requires modifications to the PBS batch job
structure (which is compiled into all PBS daemons): the addition of
the "speed" and "tmpdir" job attributes, which allow the user to specify
the speed (in MHz) of the execution host and the amount of space needed
on /tmp, respectively. Ver.2.x of the scheduler supports nine attributes
that are "reserved for future use" (see the New Features section).

Due to these modifications it is necessary to rebuild all of PBS.
It is suggested that a clean build be performed, as follows (note
that $PBSSRC refers to the top of the PBS source tree, where the
file configure is, and that $PBSOBJ refers to the top of the object
tree where PBS is built):

    cd $PBSSRC/src/include
    cp $PBSSRC/src/scheduler.cc/samples/dec_cluster/site_resc_attr_def.ht .
    cd $PBSOBJ
    make clean
    $PBSSRC/configure [your options] --set-sched-code=dec_cluster
    make
    make install

Required modifications to existing PBS configuration
----------------------------------------------------

There are several changes that will need to be made to the PBS configuration.
This version of the custom scheduler can take advantage of the server nodes
file, which contains one line per node. (For a detailed explanation of
the format of the "nodes" file, see the PBS Admin Guide.)

1. Edit $PBSHOME/server_priv/nodes, and add one line for each execution
   host, as the following example shows:

   piglet
   evelyn
   mrjnode1
   mrjnode2
   mrjnode3
   mrjnode4

2. Add the following entry to each MOM's config file
   ($PBSHOME/mom_priv/config) to enable the querying of /tmp space:

tmpdir !/bin/df -k /tmp |/bin/grep /tmp |/bin/awk '{printf("%f\n",$5 * 1024)}'

3. Create an execution queue that will become a holding queue from which
   the scheduler will pull jobs. This queue will need certain minimum
   attributes set, as indicated below:

   first start the server:

   #pbs_server

   then change the queue attributes:

   #qmgr
   set queue funnel queue_type = Execution
   set queue funnel resources_default.mem = 512mb
   set queue funnel resources_default.ncpus = 1
   set queue funnel resources_default.walltime = 00:05:00
   set queue funnel enabled = True
   set queue funnel started = True

4. Set the default and maximum attributes for each execution queue.
   It is suggested you dump the qmgr output to a file, and then edit
   the file (e.g. 'qmgr -c "p s" > /tmp/somefile'). The example below
   shows the recommended attributes (and changes via qmgr):

   #qmgr
   set queue evelyn queue_type = Execution
   set queue evelyn from_route_only = True
   set queue evelyn resources_max.mem = 2gb
   set queue evelyn resources_max.ncpus = 8
   set queue evelyn resources_max.speed = 200
   set queue evelyn resources_max.walltime = 08:00:00
   set queue evelyn resources_default.mem = 512mb
   set queue evelyn resources_default.ncpus = 1
   set queue evelyn resources_default.walltime = 00:05:00
   set queue evelyn enabled = True
   set queue evelyn started = True
   ...

Configuring the custom scheduler
--------------------------------

The scheduler configuration file (as discussed above) will need to
be modified for your site. Edit $PBSHOME/sched_priv/sched_config
changing, in particular, the BATCH_QUEUES line.
This should contain the list of all the queues you have defined, and the
associated execution host for each. For example, host "piglet.mrj.com"
has an associated queue named "piglet" and "evelyn.mrj.com" is fed by
queue "evelyn":

BATCH_QUEUES    piglet@piglet.mrj.com,evelyn@evelyn.mrj.com

However, the full hostname is not required, so for brevity, one could enter:

BATCH_QUEUES    piglet@piglet,evelyn@evelyn

The FAIR_ACCESS directive will also need to be updated, as described
in the configuration file.

Review the other configuration parameters, and change any as needed.
They are currently set to recommended defaults.

Using the new features
----------------------

This scheduler supports several modifications to PBS to allow users to
specify "non-standard" requirements.  These are used as follows:

To request a host of at least 100 MHz:

    qsub -l speed=100 scriptname

To request a host with at least 350 MB free space on /tmp:

    qsub -l tmpdir=350mb scriptname

To request a host with both:

    qsub -l speed=100,tmpdir=350mb script
or
    qsub -l speed=100 -l tmpdir=350mb script

To request a host with (for example) a "blue" featureA (a generic string
attribute):

    qsub -l featureA="blue" script

The scheduler will ensure that the job is not placed on a host *slower*
than requested. For tmpdir, the scheduler will query the mom daemon to
check the *available* free space on /tmp. If sufficient space is there,
the job will be run. For the various "featureX" attributes, the scheduler
will match user requests for these attributes against queues, attempting
to find one that matches.
If a matching queue is not currently available, the scheduler will not
run the job.

The following attributes will be useful to your users:

    speed    : MHz speed of execution host
    tmpdir   : amount of /tmp space needed
    ncpus    : number of CPUs needed
    mem      : amount of memory needed
    walltime : amount of wallclock time needed
    featureA : (string)
    featureB : (string)
    featureC : (string)
    featureD : (integer)
    featureE : (integer)
    featureF : (integer)
    featureG : (boolean)
    featureH : (boolean)
    featureI : (boolean)

General Notes
-------------

This section has some general comments about this scheduler, and things
to be aware of.

In order for the matching of "speed" to work correctly, the queues need
a corresponding maximum resource limit set. E.g.:

    Qmgr: set queue QUEUENAME resources_max.speed = xxx

Since this version of the scheduler supports the PBS nodes file, you
can use the "pbsnodes" command to view node status, take nodes offline,
etc.

    pbsnodes -l      # lists all down/offline/unavailable nodes
    pbsnodes -a      # lists all info for all nodes
    pbsnodes -o node # mark the named node OFFLINE; running jobs will
                     # continue to run on that node, but no new
                     # jobs will be started on it.
    pbsnodes -c node # clear or remove the OFFLINE status on node,
                     # making it available for running jobs again.

The "featureX" attributes also need maximum limits set on the queues in
order for the scheduler to match on them. E.g.:

    Qmgr: set queue QUEUENAME resources_max.featureA = green
    Qmgr: set queue QUEUENAME resources_max.featureD = 1000
    Qmgr: set queue QUEUENAME resources_max.featureG = false

The names of the "featureX" attributes can be changed, if desired. To
do so, edit $PBSSRC/src/include/site_resc_attr_def.ht replacing the
featureX name with the new desired name. Then edit the scheduler so
that it knows about the new names:

   $PBSSRC/src/scheduler.cc/samples/dec_cluster/toolkit.h
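
As a rough illustration of the matching rules in these notes (a job is not
placed on a host slower than requested, and "featureX" requests are matched
against the limits set on each queue), the following Python sketch may help.
It is not the scheduler's actual code. The function names, the dictionary
layout, the piglet speed and the feature values are hypothetical. Comparing
"featureX" values by exact equality, and treating a queue with no limit set
as a non-match, are assumptions made only for this example.

```python
def queue_satisfies(job_request, queue_max):
    """Return True if a queue's resources_max values satisfy a job request.

    Only "speed" and "featureX" requests are examined in this sketch.
    """
    for name, wanted in job_request.items():
        if name != "speed" and not name.startswith("feature"):
            continue  # ncpus, mem, walltime, etc. are out of scope here
        if name not in queue_max:
            # Assumption: without a resources_max limit there is nothing
            # to match against, so the queue is treated as not matching.
            return False
        if name == "speed":
            # The job must not land on a host slower than requested.
            if queue_max[name] < wanted:
                return False
        elif queue_max[name] != wanted:
            # Assumption: featureX values are compared for exact equality.
            return False
    return True


def find_queue(job_request, queues):
    """Return the name of the first matching queue, or None if none match."""
    for queue_name, queue_max in queues.items():
        if queue_satisfies(job_request, queue_max):
            return queue_name
    return None  # no matching queue: the job is not run


# Hypothetical resources_max settings for the two example queues.
queues = {
    "piglet": {"speed": 100, "featureA": "green"},
    "evelyn": {"speed": 200, "featureA": "blue"},
}

print(find_queue({"speed": 100, "featureA": "blue"}, queues))  # evelyn
print(find_queue({"speed": 300}, queues))                      # None
```

In this sketch the first request skips "piglet" because its featureA value
differs, and the second request matches no queue because neither limit
reaches 300.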
