
📄 http://www.cs.washington.edu/research/smt/papers/tlpabstract.html

📁 This data set contains WWW-pages collected from computer science departments of various universities
<html><head><title>Converting Thread-Level Parallelism to Instruction-Level Parallelism via Simultaneous Multithreading</title></head>
<body>
<h2>Converting Thread-Level Parallelism to Instruction-Level Parallelism via Simultaneous Multithreading</h2>
<hr>
<a href="http://www.cs.washington.edu/homes/jlo">Jack L. Lo</a>, <a href="http://www.cs.washington.edu/homes/eggers">Susan J. Eggers</a>, Joel S. Emer, <a href="http://www.cs.washington.edu/homes/levy">Henry M. Levy</a>, Rebecca L. Stamm, and <a href="http://www.cs.washington.edu/homes/tullsen">Dean M. Tullsen</a>
<hr>
<p>To achieve high performance, contemporary computer systems rely on two forms of parallelism: <i>instruction-level</i> parallelism (ILP) and <i>thread-level</i> parallelism (TLP). Wide-issue superscalar processors exploit ILP by executing multiple instructions from a single program in a single cycle. Multiprocessors (MP) exploit TLP by executing different threads in parallel on different processors. Unfortunately, both parallel-processing styles statically partition processor resources, thus preventing them from adapting to dynamically-changing levels of TLP and ILP in a program. With insufficient TLP, processors in an MP will be idle; with insufficient ILP, multiple-issue hardware on a superscalar is wasted.</p>
<p>This paper explores parallel processing on an alternative architecture, <i>simultaneous multithreading</i> (SMT), which allows multiple threads to compete for and share all of the processor's resources <i>every</i> cycle. The most compelling reason for running parallel applications on an SMT processor is its ability to use thread-level parallelism and instruction-level parallelism interchangeably. By permitting multiple threads to share the processor's functional units simultaneously, the processor can use both ILP and TLP to tolerate variations in parallelism. When a program has only a single thread, all of the SMT processor's resources can be dedicated to that thread; when more TLP exists, this parallelism can compensate for a lack of per-thread ILP.</p>
<p>In this work, we examine two alternative on-chip parallel architectures enabled by the greatly-increased chip densities expected in the near future. We compare SMT and small-scale, on-chip multiprocessors (MP) in their ability to exploit both ILP and TLP. First, we identify the hardware bottlenecks that prevent multiprocessors from efficiently exploiting ILP. Then, we show that because of its dynamic resource sharing, SMT avoids these inefficiencies and benefits from being able to run more threads on a single processor. The use of TLP is especially advantageous when per-thread ILP is limited. The ease of adding additional thread contexts on an SMT (relative to adding additional processors on an MP) allows simultaneous multithreading to expose more parallelism, further increasing processor utilization and attaining a 52% average speedup (versus a four-processor, single-chip multiprocessor with comparable execution resources).</p>
<p>We also assess how the memory hierarchy is affected by the use of additional thread-level parallelism. We show that inter-thread interference and the increased memory requirements have small impacts on total program performance and do not inhibit significant program speedups.</p>
<hr>
<p><i>Submitted for publication, July 1996.</i></p>
<p>To get the PostScript file, click <a href="http://www.cs.washington.edu/research/smt/papers/tlp2ilp.ps">here</a>.</p>
<hr>
<address><em>jlo@cs.washington.edu</em></address>
</body></html>
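The contrast the abstract draws between static partitioning and dynamic sharing can be illustrated with a toy issue-slot model. This is a minimal sketch, not the paper's simulator: the issue widths, thread count, and random per-thread ILP distribution below are all illustrative assumptions. It models an on-chip MP as two fixed 4-wide cores (each thread can only use its own core's slots) and an SMT as one 8-wide core whose slots all threads compete for every cycle.

```python
# Toy issue-slot model contrasting static partitioning (on-chip MP)
# with dynamic sharing (SMT). All parameters are illustrative
# assumptions, not taken from the paper.
import random

random.seed(0)
CYCLES, THREADS, WIDTH = 10_000, 2, 8   # same total issue width (8) in both designs

mp_issued = smt_issued = 0
for _ in range(CYCLES):
    # Ready instructions per thread this cycle (per-thread ILP varies).
    ilp = [random.randint(0, 6) for _ in range(THREADS)]
    # MP: each thread owns a fixed 4-wide core; its spare slots go unused
    # even when the other thread has more ready instructions than slots.
    mp_issued += sum(min(i, WIDTH // THREADS) for i in ilp)
    # SMT: all threads compete for all 8 slots every cycle, so one
    # thread's surplus ILP can fill slots the other thread left idle.
    smt_issued += min(sum(ilp), WIDTH)

print(f"MP  utilization: {mp_issued / (CYCLES * WIDTH):.2%}")
print(f"SMT utilization: {smt_issued / (CYCLES * WIDTH):.2%}")
```

Because min(a, 4) + min(b, 4) can never exceed min(a + b, 8), the shared design issues at least as many instructions every cycle, and it pulls ahead exactly when the threads' ILP is unbalanced, which is the adaptation to varying TLP and ILP the abstract describes.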
