rement the corresponding module's reference count
11. The child is marked as 'has not execed': p->did_exec = 0
12. The child is marked as 'not-swappable': p->swappable = 0
13. The child is put into 'uninterruptible sleep' state: p->state = TASK_UNINTERRUPTIBLE (TODO: why is this done? I think it's not needed - get rid of it, Linus confirms it is not needed)
14. The child's p->flags are set according to the value of clone_flags; for the plain fork(2) it is p->flags = PF_FORKNOEXEC.
15. The child's pid p->pid is set using the fast algorithm in kernel/fork.c:get_pid() (TODO: lastpid_lock spinlock can be made redundant since get_pid() is always called under big kernel lock from do_fork(), also remove flags argument of get_pid, patch sent to Alan on 20/06/2000 - followup later).
16. The rest of the code in do_fork() initialises the rest of the child's task structure. At the very end, the child's task structure is hashed into the pidhash hashtable and the child is woken up (TODO: wake_up_process(p) sets p->state = TASK_RUNNING and adds the process to the runq, therefore we probably didn't need to set p->state to TASK_RUNNING earlier on in do_fork()). The interesting part is setting p->exit_signal to clone_flags & CSIGNAL, which for fork(2) means just SIGCHLD, and setting p->pdeath_signal to 0. The pdeath_signal is used when a process 'forgets' the original parent (by dying) and can be set/get by means of the PR_GET/SET_PDEATHSIG commands of the prctl(2) system call (You might argue that the way the value of pdeath_signal is returned via a userspace pointer argument in prctl(2) is a bit silly - mea culpa, after Andries Brouwer updated the manpage it was too late to fix ;)
Thus tasks are created. There are several ways for tasks to terminate:
1. By making the exit(2) system call
2. By being delivered a signal with default disposition to die
3. By being forced to die under certain exceptions
4. By calling bdflush(2) with func == 1 (this is Linux-specific, for compatibility with old distributions that still had the 'update' line in /etc/inittab - nowadays the work of update is done by the kernel thread kupdate)
Functions implementing system calls under Linux are prefixed with 'sys_', but they are usually concerned only with argument checking or arch-specific ways to pass some information, and the actual work is done by 'do_' functions. So it is with sys_exit(), which calls do_exit() to do the work. Although other parts of the kernel sometimes invoke sys_exit(), they should really call do_exit().
The function do_exit() is found in kernel/exit.c. The points to note about do_exit():
· Uses global kernel lock (locks but doesn't unlock)
· Calls schedule() at the end which never returns
· Sets the task state to TASK_ZOMBIE
· Notifies any child with current->pdeath_signal, if not 0
· Notifies the parent with a current->exit_signal, which is usually equal to SIGCHLD
· Releases resources allocated by fork, closes open files etc
· On architectures that use lazy FPU switching (ia64, mips, mips64) (TODO: remove 'flags' argument of sparc, sparc64), do whatever the hardware requires to pass the FPU ownership (if owned by current) to "none"

2.3 Linux Scheduler

The job of a scheduler is to arbitrate access to the current CPU between multiple processes. The scheduler is implemented in the 'main kernel file' kernel/sched.c. The corresponding header file include/linux/sched.h is included (either explicitly or indirectly) by virtually every kernel source file.
The fields of the task structure relevant to the scheduler include:
· p->need_resched, set if schedule() should be invoked at the 'next opportunity'
· p->counter, number of clock ticks left to run in this scheduling slice, decremented by timer.
When it drops to zero or below, it is reset to 0 and p->need_resched is set. This is also sometimes called the 'dynamic priority' of a process because it can change by itself
· p->priority, static priority, only changed through well-known system calls like nice(2), POSIX.1b sched_setparam(2) or 4.4BSD/SVR4 setpriority(2)
· p->rt_priority, realtime priority
· p->policy, scheduling policy, specifies which scheduling class the task belongs to. Tasks can change their scheduling class using the sched_setscheduler(2) system call. The valid values are SCHED_OTHER (traditional UNIX process), SCHED_FIFO (POSIX.1b FIFO realtime process) and SCHED_RR (POSIX round-robin realtime process). One can also OR SCHED_YIELD to any of these values to signify that the process decided to yield the CPU, for example by calling the sched_yield(2) system call. A FIFO realtime process runs until either a) it blocks on I/O, b) it explicitly yields the CPU or c) it is preempted by another realtime process with a higher p->rt_priority value. SCHED_RR is the same as SCHED_FIFO except that when its timeslice expires it goes back to the end of the runqueue

The scheduler's algorithm is simple, despite the great apparent complexity of the schedule() function. The function is complex because it implements three scheduling algorithms in one and also because of the subtle SMP-specifics.
The apparently 'useless' gotos in schedule() are there for a purpose - to generate the best optimized (for i386) code. Also, note that the scheduler (like most of the kernel) was completely rewritten for 2.4, so the discussion below does not apply to 2.2 or to any other old kernels.
Let us look at the function in detail:
1. if current->active_mm == NULL then something is wrong. The current process, even a kernel thread (current->mm == NULL), must have a valid p->active_mm at all times
2. if there is something to do on the tq_scheduler task queue, process it now. Task queues provide a kernel mechanism to schedule execution of functions at a later time. We shall look at them in detail elsewhere.
3. initialize local variables prev and this_cpu to the current task and current CPU respectively
4. check if schedule() was invoked from an interrupt handler (due to a bug) and panic if so
5. release the global kernel lock
6. if there is some work to do via the softirq mechanism, do it now
7. initialize local pointer 'struct schedule_data *sched_data' to point to the per-CPU (cacheline-aligned to prevent cacheline ping-pong) scheduling data area containing the TSC value of last_schedule and the pointer to the last scheduled task structure (TODO: sched_data is used on SMP only but why does init_idle() initialise it on UP as well?)
8. runqueue_lock spinlock is taken. Note that we use spin_lock_irq() because in schedule() we guarantee that interrupts are enabled, so when we unlock runqueue_lock we can just re-enable them instead of saving/restoring eflags (spin_lock_irqsave/restore variant)
9. task state machine: if the task is in TASK_RUNNING state it is left alone, if it is in TASK_INTERRUPTIBLE and a signal is pending then it is moved into TASK_RUNNING state. In all other cases it is deleted from the runqueue
10. next (best candidate to be scheduled) is set to the idle task of this cpu. However, the goodness of this candidate is set to a very low value of -1000 in the hope that there is someone better than that.
11. if the prev (current) task is in TASK_RUNNING state, then the current goodness is set to its goodness and it is marked as a better candidate to be scheduled than the idle task
12. now the runqueue is examined and the goodness of each process that can be scheduled on this cpu is compared with the current value, and the process with the highest goodness wins.
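The candidate selection in steps 10 to 12 can be sketched in a few lines of C. This is only an illustration and not the code from kernel/sched.c: struct sketch_task, its eligible and weight fields, sketch_goodness() and pick_next() are all hypothetical names, the runqueue is modelled as a plain array, and keeping the earlier candidate on a tie is an assumption.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's task structure. */
enum sketch_state { SKETCH_RUNNING, SKETCH_SLEEPING };

struct sketch_task {
    enum sketch_state state;
    int eligible;  /* assumed flag: nonzero if the task can be scheduled on this cpu */
    int weight;    /* stand-in for whatever the real goodness calculation returns */
};

/* Placeholder: the real goodness calculation is not reproduced here. */
static int sketch_goodness(const struct sketch_task *p)
{
    return p->weight;
}

/*
 * Step 10: start with the idle task at goodness -1000.
 * Step 11: if prev is still runnable, it becomes the candidate to beat.
 * Step 12: scan the runqueue; an eligible task with higher goodness wins.
 */
static struct sketch_task *pick_next(struct sketch_task *idle,
                                     struct sketch_task *prev,
                                     struct sketch_task *runqueue[],
                                     size_t n)
{
    struct sketch_task *next = idle;
    int c = -1000;
    size_t i;

    if (prev->state == SKETCH_RUNNING) {
        c = sketch_goodness(prev);
        next = prev;
    }

    for (i = 0; i < n; i++) {
        struct sketch_task *p = runqueue[i];

        if (!p->eligible)
            continue;
        if (sketch_goodness(p) > c) {  /* assumption: ties keep the earlier candidate */
            c = sketch_goodness(p);
            next = p;
        }
    }
    return next;
}
```

With an empty runqueue and a sleeping prev, the sketch falls back to the idle task, which is the point of seeding the search with -1000.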
Now the concept of "can be scheduled on this cpu" must be clarified - on UP every process on the runqueue is eligible to be schedul