📄 node21.html

📁 Design and building parallel program
💻 HTML
📖 第 1 页 / 共 2 页
字号:
上一页 12
<A NAME=1914>&#160;</A>
<em> breadth-first
 </em> fashion.  Notice that only tasks on the
<A NAME=1916>&#160;</A>
wavefront can execute concurrently.  We also need to address the issue
of how to manage the <em> A</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img305.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img305.gif"> value, which must be accessed by
all tasks.  For now, we assume that it is encapsulated in a single
task with which other tasks will communicate.
<P>
A quick review using the design checklist of Section <A HREF="node16.html#secrules1" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node16.html#secrules1">2.2.3</A>
reveals one deficiency in this design.  The breadth-first exploration
strategy is likely to decrease performance dramatically by delaying
discovery of solution nodes and hence reducing the amount of pruning
that occurs, thereby leading to considerable redundant
computation.  We must bear this issue in mind in subsequent design
phases.
<P>
<H4><A NAME=SECTION02372020000000000000> Communication.</A></H4>
<P>
<A NAME=1921>&#160;</A>
In a parallel implementation of simple search
(Algorithm <A HREF="node10.html#algsearch" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node10.html#algsearch">1.1</A>), tasks can execute independently and need
communicate only to report solutions.  In contrast, branch-and-bound
search requires communication during execution in order to obtain and update
the search bound <em> A</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img306.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img306.gif">.  In designing a communication
structure to achieve this goal, we need to trade off the benefits of
frequent accesses to a centralized <em> A</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img307.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img307.gif"> value (which tends
to reduce the amount of the search tree that must be explored) against
communication costs.
<P>
One approach is to encapsulate responsibility for maintaining
<em> A</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img308.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img308.gif"> in a centralized task, with which each task
communicates when a solution is produced or a bound is required.  This
approach is simple and may even be efficient if communication is
cheap, evaluating a node is expensive, and the number of
processors is not too large.  However, the centralized approach is
inherently nonscalable.
Since the manager must take a certain amount
of time to process a request, the maximum rate at which it can service
requests, and hence the maximum number of tasks that can execute
concurrently, is bounded.
<P>
Various refinements to this centralized scheme can be imagined.  We
can modify Algorithm <A HREF="node21.html#algbbs" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node21.html#algbbs">2.2</A> to check <em> A</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img309.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img309.gif"> only
periodically, for example when a depth counter incremented on each
recursive call is an integer multiple of a specified frequency
parameter.  Or, we can partition the tree into subtrees, each with its
own <em> A</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img310.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img310.gif"> submanager, and organize periodic exchanges of
information between these submanagers.  For example, submanagers can
perform broadcast operations when they discover significantly better
solutions.
<P>
<H4><A NAME=SECTION02372030000000000000> Agglomeration.</A></H4>
<P>
In the agglomeration phase of the design process we start to address
practical issues relating to performance on target computers.  In the
floorplan optimization problem, this means we must address two potential
<A NAME=1935>&#160;</A>
deficiencies of the fine-grained algorithm that we have developed.
The first will be familiar from earlier problems, that is, the cost of
creating a large number of fine-grained tasks.  This can be addressed
using agglomeration, unless we believe that node evaluation is
sufficiently expensive and task creation sufficiently cheap for the
fine-grained algorithm to be efficient.  For example, we can create
one task for each <tt> search</tt> call in the <tt> foreach</tt> statement of
<A NAME=1938>&#160;</A>
Algorithm <A HREF="node21.html#algbbs" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node21.html#algbbs">2.2</A> until we reach a specified depth in the tree,
and then switch to a depth-first strategy, thereby creating a single task that
evaluates <tt> search</tt> calls in sequence (Figure <A HREF="node21.html#figsearch" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node21.html#figsearch">2.30</A>).
If the switch to depth-first search is performed at depth <em> D</em>
 and
cell <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img311.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img311.gif"> has <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img312.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img312.gif"> implementations, then in the absence of
pruning this technique creates <IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img313.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img313.gif"> tasks.
<P>
<P><A NAME=2936>&#160;</A><IMG BORDER=0 ALIGN=BOTTOM ALT="" SRC="img314.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img314.gif">
<BR><STRONG>Figure 2.30:</STRONG> <em> Increasing granularity in a search problem.  In this
figure, we agglomerate by switching to a sequential search at level
two in the search tree.  A task is created for each subtree rooted at
level two.</em><A NAME=figsearch>&#160;</A><BR>
<P>
<P>
The second potential deficiency is more subtle and relates to the
scheduling of tasks rather than to their creation.  In the absence of
explicit programmer control, we can assume that the tasks created to
evaluate search tree nodes will execute either in the order that they
are created or perhaps in a random order. In either case, the search
tree tends to be explored in a breadth-first fashion.  This is
undesirable because it tends to reduce the effectiveness of pruning
and hence cause redundant computation.  The solution to this problem
is to control the order in which search tree nodes are explored.  That
is, we must implement a task-scheduling algorithm.  Because this is
really a mapping issue, we discuss it under ``Mapping.''
<P>
<H4><A NAME=SECTION02372040000000000000> Mapping.</A></H4>
<P>
<A NAME=1952>&#160;</A>
Recall that when we use a task-scheduling strategy, tasks (search tree
<A NAME=1953>&#160;</A>
nodes) become ``problems'' to be executed by one of a smaller number
of ``worker'' tasks, typically one per processor.  Workers generate
new search problems as they expand nodes, and request new search
problems each time they complete previously assigned problems.
Requests can be handled using a centralized or decentralized strategy.
<P>
We can imagine a variety of alternative task-scheduling schemes for
the floorplan optimization problem.  One approach works in conjunction
with the agglomeration scheme of Figure <A HREF="node21.html#figsearch" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node21.html#figsearch">2.30</A>.  A central
manager first constructs a number of coarse-grained tasks, by
exploring the search tree to depth <em> D</em>
.  These tasks are then
assigned to idle workers in a demand-driven manner.  Because each task
can be represented by a short vector representing the path taken to
its position in the tree, the data movement costs associated with this
scheme are not high.  Furthermore, because each processor executes one
subtree at a time in a depth-first fashion, pruning is effective.
<P>
An interesting variant of this approach combines elements of both
redundant work and cyclic mapping to avoid the need for a central
manager.  Every worker expands the tree to depth <em> D</em>
.  Then, each
worker takes responsibility for a disjoint subset of the tasks
generated.  (This subset could be identified using a cyclic allocation
strategy, for example.)  Only if a worker becomes idle does it ask
other workers for tasks.
<P>
A third strategy, more complex but also more general, is
<A NAME=1957>&#160;</A>
initially to allocate the root node to a single worker.  Load
balancing is then achieved by causing workers with empty queues to
request problems from other workers.  Each worker can then enforce a
local depth-first search strategy, and hence increase the amount of
pruning, by ordering its queue of search problems according to their
depth in the tree.  This method allows the worker to select problems far from the
root for local execution and problems nearer to the root to hand to
other workers.
<P>
Our choice of task scheduling strategy will depend on characteristics
of our problem and target computer and can be determined by analysis
and experiment.  Notice that the communication structures used for
task scheduling can be integrated with those proposed earlier for
maintaining <em> A</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img315.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img315.gif">.  For example, a central manager used for
task scheduling can also maintain and distribute an up-to-date search
bound with each task.  In decentralized schemes, the worker tasks that
execute search problems can broadcast improved search bound values to
other workers.
<P>
<H2><A NAME=SECTION02373000000000000000>2.7.3 Floorplan Summary</A></H2>
<P>
The parallel algorithm designed in this case study is certainly more
complex, and perhaps less obvious, than that developed for the
atmosphere model.  It is clear from the start that functional
decomposition techniques should be used to define tasks, that
responsibility for maintaining <em> A</em>
<IMG BORDER=0 ALIGN=MIDDLE ALT="" SRC="img316.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/img316.gif"> should be isolated from
the rest of the computation, and that we can increase task granularity
by switching from a parallel to a sequential evaluation strategy at a
specified depth in the search tree.  If we were concerned with
parallelizing simple search, the design might be complete at this
stage.  However, the need to support pruning requires that we proceed
with further refinements.  In particular, we introduce a
task-scheduling algorithm so that we can pursue depth-first search on
each processor while exposing higher-level search tree nodes for idle
workers.
<A NAME=1963>&#160;</A>
<P>

<BR> <HR><a href="msgs0.htm#2" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#2"><img ALIGN=MIDDLE src="asm_color_tiny.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/asm_color_tiny.gif" alt="[DBPP]"></a>    <A NAME=tex2html2095 HREF="node20.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node20.html"><IMG ALIGN=MIDDLE ALT="previous" SRC="previous_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/previous_motif.gif"></A> <A NAME=tex2html2103 HREF="node22.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node22.html"><IMG ALIGN=MIDDLE ALT="next" SRC="next_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/next_motif.gif"></A> <A NAME=tex2html2101 HREF="node14.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node14.html"><IMG ALIGN=MIDDLE ALT="up" SRC="up_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/up_motif.gif"></A> <A NAME=tex2html2105 HREF="node1.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node1.html"><IMG ALIGN=MIDDLE ALT="contents" SRC="contents_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/contents_motif.gif"></A> <A NAME=tex2html2106 HREF="node133.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node133.html"><IMG ALIGN=MIDDLE ALT="index" SRC="index_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/index_motif.gif"></A> <a href="msgs0.htm#3" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#3"><img ALIGN=MIDDLE src="search_motif.gif" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/search_motif.gif" alt="[Search]"></a>   <BR>
<B> Next:</B> <A NAME=tex2html2104 HREF="node22.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node22.html">2.8 Case Study: Computational Chemistry</A>
<B>Up:</B> <A NAME=tex2html2102 HREF="node14.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node14.html">2 Designing Parallel Algorithms</A>
<B> Previous:</B> <A NAME=tex2html2096 HREF="node20.html" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/node20.html">2.6 Case Study: Atmosphere Model</A>
<BR><HR><P>
<P><ADDRESS>
<I>&#169 Copyright 1995 by <A href="msgs0.htm#6" tppabs="http://www.dit.hcmut.edu.vn/books/system/par_anl/tppmsgs/msgs0.htm#6">Ian Foster</a></I>
</ADDRESS>
</BODY>
上一页 12
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -