<LI>Flynn's taxonomy distinguishes multi-processor computer architectures according to how they can be classified along the two independent dimensions of <B><I>Instruction</I></B> and <B><I>Data</I></B>. Each of these dimensions can have only one of two possible states: <B><I>Single</I></B> or <B><I>Multiple</I></B>.<P>
<LI>The matrix below defines the 4 possible classifications according to Flynn.<P>
<TABLE BORDER=1 CELLSPACING=0 CELLPADDING=5>
<TR ALIGN=center>
  <TD><H4><FONT COLOR=blue>S I S D</FONT><P>Single Instruction, Single Data</H4></TD>
  <TD><H4><FONT COLOR=blue>S I M D</FONT><P>Single Instruction, Multiple Data</H4></TD>
</TR>
<TR ALIGN=center>
  <TD><H4><FONT COLOR=blue>M I S D</FONT><P>Multiple Instruction, Single Data</H4></TD>
  <TD><H4><FONT COLOR=blue>M I M D</FONT><P>Multiple Instruction, Multiple Data</H4></TD>
</TR>
</TABLE>
</UL>
<P>
<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0>
<TR VALIGN=top>
<TD>
<IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>Single Instruction, Single Data (SISD):</SPAN>
<UL>
<LI>A serial (non-parallel) computer
<LI>Single instruction: only one instruction stream is being acted on by the CPU during any one clock cycle
<LI>Single data: only one data stream is being used as input during any one clock cycle
<LI>Deterministic execution
<LI>This is the oldest and, until recently, the most prevalent form of computer
<LI>Examples: most PCs, single CPU workstations and mainframes
</UL>
</TD>
<TD><IMG SRC=images/sisd.gif WIDTH=188 HEIGHT=224 BORDER=0 HSPACE=10 ALT='SISD'></TD>
</TR>
</TABLE>
<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0>
<TR VALIGN=top>
<TD>
<IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>Single Instruction, Multiple Data (SIMD):</SPAN>
<UL>
<LI>A type of parallel computer
<LI>Single instruction: All processing units execute the same instruction at any given clock cycle
<LI>Multiple data: Each processing unit can operate on a different data element
<LI>This type of machine typically has an instruction dispatcher, a very high-bandwidth internal network, and a very large array of very small-capacity instruction units.
<LI>Best suited for specialized problems characterized by a high degree of regularity, such as image processing.
<LI>Synchronous (lockstep) and deterministic execution
<LI>Two varieties: Processor Arrays and Vector Pipelines
<LI>Examples:
    <UL TYPE=circle>
    <LI>Processor Arrays: Connection Machine CM-2, Maspar MP-1, MP-2
    <LI>Vector Pipelines: IBM 9000, Cray C90, Fujitsu VP, NEC SX-2, Hitachi S820
    </UL>
</UL>
</TD>
<TD><BR><BR><IMG SRC=images/simd.gif WIDTH=438 HEIGHT=245 BORDER=0 HSPACE=10 ALT='SIMD'></TD>
</TR>
</TABLE>
<P>
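<P>As a rough illustration of the SIMD idea, the C fragment below contrasts a scalar loop
(one instruction acting on one data element per iteration, as on an SISD machine) with a
vector addition in which a single conceptual "add" is applied to four data elements at once.
This is only a sketch: it assumes a compiler that supports GCC/Clang vector extensions, and
the type name <TT>v4f</TT> is invented for the example.
<PRE>
/* SISD-style loop vs. SIMD-style vector add (GCC/Clang vector extensions).
 * Illustrative sketch only; the v4f typedef is an invented name.          */
#include &lt;stdio.h&gt;

typedef float v4f __attribute__((vector_size(16)));  /* four packed floats */

int main(void)
{
    float a[4] = {1, 2, 3, 4}, b[4] = {10, 20, 30, 40}, c[4];

    /* SISD-style: one data element handled per loop iteration */
    for (int i = 0; i &lt; 4; i++)
        c[i] = a[i] + b[i];

    /* SIMD-style: the same operation applied to four data elements at once */
    v4f va = {1, 2, 3, 4}, vb = {10, 20, 30, 40};
    v4f vc = va + vb;

    printf("%g %g\n", c[3], vc[3]);   /* both print 44 */
    return 0;
}
</PRE>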
<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0>
<TR VALIGN=top>
<TD>
<IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>Multiple Instruction, Single Data (MISD):</SPAN>
<UL>
<LI>A single data stream is fed into multiple processing units.
<LI>Each processing unit operates on the data independently via independent instruction streams.
<LI>Few actual examples of this class of parallel computer have ever existed. One is the experimental Carnegie-Mellon C.mmp computer (1971).
<LI>Some conceivable uses might be:
    <UL>
    <LI>multiple frequency filters operating on a single signal stream
    <LI>multiple cryptography algorithms attempting to crack a single coded message.
    </UL>
</UL>
</TD>
<TD><BR><BR><IMG SRC=images/misd.gif WIDTH=438 HEIGHT=207 BORDER=0 HSPACE=10 ALT='MISD'></TD>
</TR>
</TABLE>
<P>
<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0>
<TR VALIGN=top>
<TD>
<IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>Multiple Instruction, Multiple Data (MIMD):</SPAN>
<UL>
<LI>Currently, the most common type of parallel computer. Most modern computers fall into this category.
<LI>Multiple Instruction: every processor may be executing a different instruction stream
<LI>Multiple Data: every processor may be working with a different data stream
<LI>Execution can be synchronous or asynchronous, deterministic or non-deterministic
<LI>Examples: most current supercomputers, networked parallel computer "grids" and multi-processor SMP computers - including some types of PCs.
</UL>
</TD>
<TD><BR><BR><IMG SRC=images/mimd.gif WIDTH=438 HEIGHT=245 BORDER=0 HSPACE=10 ALT='MIMD'></TD>
</TR>
</TABLE>
<!--========================================================================-->
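<P>On an MIMD machine, each processor may run a different instruction stream over a
different data stream. In Pthreads terms this is simply several threads executing different
functions on different data at the same time, as in the hedged sketch below (the function
and variable names are invented for the example; a typical build command would be
<TT>cc mimd_sketch.c -lpthread</TT>).
<PRE>
/* MIMD sketch with POSIX threads: two threads execute different instruction
 * streams (sum_task vs. max_task) over different data streams concurrently. */
#include &lt;pthread.h&gt;
#include &lt;stdio.h&gt;

static int    a[4] = {3, 1, 4, 1};            /* data stream for thread 1 */
static double b[4] = {2.5, 9.0, 0.5, 7.25};   /* data stream for thread 2 */
static long   sum_result;
static double max_result;

static void *sum_task(void *arg)              /* instruction stream 1 */
{
    for (int i = 0; i &lt; 4; i++)
        sum_result += a[i];
    return NULL;
}

static void *max_task(void *arg)              /* instruction stream 2 */
{
    for (int i = 0; i &lt; 4; i++)
        if (b[i] > max_result)
            max_result = b[i];
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, sum_task, NULL);
    pthread_create(&t2, NULL, max_task, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("sum = %ld, max = %g\n", sum_result, max_result);
    return 0;
}
</PRE>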
<P><A NAME=Terminology> <BR><BR> </A>
<TABLE BORDER=1 CELLPADDING=5 CELLSPACING=0 WIDTH=100%>
<TR><TD BGCOLOR=#98ABCE><SPAN class=heading1>Concepts and Terminology</SPAN></TD></TR>
</TABLE>
<H2>Some General Parallel Terminology</H2>
Like everything else, parallel computing has its own "jargon". Some of the more commonly used terms associated with parallel computing are listed below. Most of these will be discussed in more detail later.
<P>
<DL>
<DT><B>Task</B>
<DD>A logically discrete section of computational work. A task is typically a program or program-like set of instructions that is executed by a processor.<P>
<DT><B>Parallel Task</B>
<DD>A task that can be executed by multiple processors safely (yields correct results)<P>
<DT><B>Serial Execution</B>
<DD>Execution of a program sequentially, one statement at a time. In the simplest sense, this is what happens on a one processor machine. However, virtually all parallel tasks will have sections of a parallel program that must be executed serially.<P>
<DT><B>Parallel Execution</B>
<DD>Execution of a program by more than one task, with each task being able to execute the same or different statement at the same moment in time.<P>
<DT><B>Shared Memory</B>
<DD>From a strictly hardware point of view, describes a computer architecture where all processors have direct (usually bus based) access to common physical memory. In a programming sense, it describes a model where parallel tasks all have the same "picture" of memory and can directly address and access the same logical memory locations regardless of where the physical memory actually exists.<P>
<DT><B>Distributed Memory</B>
<DD>In hardware, refers to network based memory access for physical memory that is not common. As a programming model, tasks can only logically "see" local machine memory and must use communications to access memory on other machines where other tasks are executing.<P>
<DT><B>Communications</B>
<DD>Parallel tasks typically need to exchange data. There are several ways this can be accomplished, such as through a shared memory bus or over a network; however, the actual event of data exchange is commonly referred to as communications regardless of the method employed.<P>
<DT><B>Synchronization</B>
<DD>The coordination of parallel tasks in real time, very often associated with communications. Often implemented by establishing a synchronization point within an application where a task may not proceed further until another task(s) reaches the same or logically equivalent point (see the sketch following this list).<P>Synchronization usually involves waiting by at least one task, and can therefore cause a parallel application's wall clock execution time to increase.<P>
<DT><B>Granularity</B>
<DD>In parallel computing, granularity is a qualitative measure of the ratio of computation to communication.
    <UL>
    <LI><B><I>Coarse:</I></B> relatively large amounts of computational work are done between communication events
    <LI><B><I>Fine:</I></B> relatively small amounts of computational work are done between communication events
    </UL><P>
<DT><B>Observed Speedup</B>
<DD>Observed speedup of a code which has been parallelized, defined as:<P>
    <TABLE BORDER=1 CELLPADDING=5 CELLSPACING=0>
    <TR><TD><FONT FACE=courier>wall-clock time of serial execution<HR>wall-clock time of parallel execution</FONT></TD></TR>
    </TABLE><P>
    One of the simplest and most widely used indicators for a parallel program's performance.<P>
<DT><B>Parallel Overhead</B>
<DD>The amount of time required to coordinate parallel tasks, as opposed to doing useful work. Parallel overhead can include factors such as:
    <UL>
    <LI>Task start-up time
    <LI>Synchronizations
    <LI>Data communications
    <LI>Software overhead imposed by parallel compilers, libraries, tools, operating system, etc.
    <LI>Task termination time
    </UL><P>
<DT><B>Massively Parallel</B>
<DD>Refers to the hardware that comprises a given parallel system - having many processors. The meaning of "many" keeps increasing, but currently BG/L pushes this number to 6 digits.<P>
<DT><B>Scalability</B>
<DD>Refers to a parallel system's (hardware and/or software) ability to demonstrate a proportionate increase in parallel speedup with the addition of more processors. Factors that contribute to scalability include:
    <UL>
    <LI>Hardware - particularly memory-CPU bandwidths and network communications
    <LI>Application algorithm
    <LI>Parallel overhead
    <LI>Characteristics of your specific application and coding
    </UL>
</DL>
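<P>The synchronization point described in the terminology above can be made concrete with a
short Pthreads sketch. In the hedged example below (names invented for illustration), each
task fills its own slice of a shared array and then waits at a barrier; no task proceeds to
the second phase until every task has reached the same point, which is exactly the waiting
that can lengthen wall clock time. POSIX barriers (<TT>pthread_barrier_*</TT>) are an
optional part of the standard; where they are not available, the same effect can be built
from a mutex and a condition variable.
<PRE>
/* Synchronization point sketch: all tasks must reach the barrier before any
 * task continues.  Typical build:  cc barrier_sketch.c -lpthread           */
#include &lt;pthread.h&gt;
#include &lt;stdio.h&gt;

#define NTHREADS 4
#define N        16

static double data[N];                 /* shared by all tasks */
static pthread_barrier_t barrier;

static void *worker(void *arg)
{
    long id = (long)arg;

    /* phase 1: each task writes only its own slice of the shared array */
    for (int i = id * (N / NTHREADS); i &lt; (id + 1) * (N / NTHREADS); i++)
        data[i] = (double)i * i;

    /* synchronization point: wait until all NTHREADS tasks reach this line */
    pthread_barrier_wait(&barrier);

    /* phase 2: every element of data[] is now safe for any task to read */
    if (id == 0)
        printf("last element after phase 1: %g\n", data[N - 1]);
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];

    pthread_barrier_init(&barrier, NULL, NTHREADS);
    for (long i = 0; i &lt; NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i &lt; NTHREADS; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&barrier);
    return 0;
}
</PRE>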
<!--========================================================================-->
<P><A NAME=MemoryArch> <BR><BR> </A><A NAME=SharedMemory> </A>
<TABLE BORDER=1 CELLPADDING=5 CELLSPACING=0 WIDTH=100%>
<TR><TD BGCOLOR=#98ABCE><SPAN class=heading1>Parallel Computer Memory Architectures</SPAN></TD></TR>
</TABLE>
<H2>Shared Memory</H2>
<IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>General Characteristics:</SPAN>
<UL>
<P><LI>Shared memory parallel computers vary widely, but generally have in common the ability for all processors to access all memory as global address space.
<P><IMG SRC=images/shared_mem.gif WIDTH=414 HEIGHT=285 BORDER=0 HSPACE=10 ALT='Shared memory architecture'>
<P><LI>Multiple processors can operate independently but share the same memory resources.
<P><LI>Changes in a memory location effected by one processor are visible to all other processors.
<P><LI>Shared memory machines can be divided into two main classes based upon memory access times: <B><I>UMA</I></B> and <B><I>NUMA</I></B>.
</UL>
<P><IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>Uniform Memory Access (UMA):</SPAN>
    <UL>
    <LI>Most commonly represented today by Symmetric Multiprocessor (SMP) machines
    <LI>Identical processors
    <LI>Equal access and access times to memory
    <LI>Sometimes called CC-UMA - Cache Coherent UMA. Cache coherent means that if one processor updates a location in shared memory, all the other processors know about the update. Cache coherency is accomplished at the hardware level.
    </UL>
<P><IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>Non-Uniform Memory Access (NUMA):</SPAN>
    <UL>
    <LI>Often made by physically linking two or more SMPs
    <LI>One SMP can directly access memory of another SMP
    <LI>Not all processors have equal access time to all memories
    <LI>Memory access across the link is slower
    <LI>If cache coherency is maintained, then it may also be called CC-NUMA - Cache Coherent NUMA
    </UL>
<P><IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>Advantages:</SPAN>
    <UL>
    <LI>Global address space provides a user-friendly programming perspective to memory
    <LI>Data sharing between tasks is both fast and uniform due to the proximity of memory to CPUs
    </UL>
<P><IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>Disadvantages:</SPAN>
    <UL>
    <LI>Primary disadvantage is the lack of scalability between memory and CPUs. Adding more CPUs can geometrically increase traffic on the shared memory-CPU path, and for cache coherent systems, geometrically increase traffic associated with cache/memory management.
    <LI>Programmer responsibility for synchronization constructs that ensure "correct" access of global memory.
    <LI>Expense: it becomes increasingly difficult and expensive to design and produce shared memory machines with ever increasing numbers of processors.
    </UL>
<!--========================================================================-->
<P><A NAME=DistributedMemory> <BR><BR> </A>
<TABLE BORDER=1 CELLPADDING=5 CELLSPACING=0 WIDTH=100%>
<TR><TD BGCOLOR=#98ABCE><SPAN class=heading1>Parallel Computer Memory Architectures</SPAN></TD></TR>
</TABLE>
<H2>Distributed Memory</H2>
<IMG SRC=../images/arrowBullet.gif ALIGN=top HSPACE=3><SPAN CLASS=heading3>General Characteristics:</SPAN>
<UL>
<P><LI>Like shared memory systems, distributed memory systems vary widely but share a common characteristic: distributed memory systems require a communication network to connect inter-processor memory.
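<P>Because tasks on a distributed memory system cannot directly address one another's memory
the way threads on the shared memory machines above can, every data exchange must be
programmed explicitly over the network. The hedged sketch below uses MPI, a common
message-passing library, purely to illustrate that idea; typical MPI installations provide
the <TT>mpicc</TT> and <TT>mpirun</TT> commands shown in the comment, but this is a sketch
of the concept, not a prescription for any particular system.
<PRE>
/* Distributed memory sketch: task 0's variable lives only in task 0's local
 * memory, so task 1 must receive a copy over the communication network.  On
 * a shared memory machine, task 1 could simply have read the same location.
 * Typical build/run:  mpicc dm_sketch.c -o dm_sketch ; mpirun -np 2 dm_sketch */
#include &lt;mpi.h&gt;
#include &lt;stdio.h&gt;

int main(int argc, char **argv)
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                 /* exists only in task 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* task 1 cannot read task 0's memory directly; it must communicate */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("task 1 received %d via message passing\n", value);
    }

    MPI_Finalize();
    return 0;
}
</PRE>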
