http://www.cs.washington.edu/homes/jlo/papers/pldi95abstract.html

Date: Tue, 10 Dec 1996 21:10:26 GMT
Server: NCSA/1.4.2
Content-type: text/html
Last-modified: Tue, 24 Oct 1995 01:11:54 GMT
Content-length: 2417

<html>
<head>
<title>"Improving Balanced Scheduling with Compiler Optimizations that Increase Instruction-Level Parallelism"</title>
</head>
<body>
<H1>Improving Balanced Scheduling with Compiler Optimizations that Increase Instruction-Level Parallelism</H1>
<hr>
<a href="http://www.cs.washington.edu/homes/jlo">Jack L. Lo</a> and <a href="http://www.cs.washington.edu/homes/eggers">Susan J. Eggers</a>
<hr>
Traditional list schedulers order instructions based on an optimistic estimate of the load latency imposed by the hardware and therefore cannot respond to variations in memory latency caused by cache hits and misses on non-blocking architectures.  In contrast, balanced scheduling schedules instructions based on an estimate of the amount of instruction-level parallelism in the program.  By scheduling independent instructions behind loads based on what the program can provide, rather than what the implementation stipulates in the best case (i.e., a cache hit), balanced scheduling can hide variations in memory latencies more effectively.
Since its success depends on the amount of instruction-level parallelism in the code, balanced scheduling should perform even better when more parallelism is available.  In this study, we combine balanced scheduling with three compiler optimizations that increase instruction-level parallelism: loop unrolling, trace scheduling and cache locality analysis.  Using code generated for the DEC Alpha by the Multiflow compiler, we simulated a non-blocking processor architecture that closely models the Alpha 21164.  Our results show that balanced scheduling benefits from all three optimizations, producing average speedups that range from 1.15 to 1.40 across the optimizations.
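The difference in how the two schedulers cover load latency can be illustrated with a toy model. This is a hypothetical sketch, not the paper's algorithm or measurements: it assumes single-cycle independent instructions, a 2-cycle cache hit, a 10-cycle miss, and six independent instructions available behind one load.

```python
# Toy model (illustrative only): how many cycles a pipeline stalls on a
# load, given how many independent single-cycle instructions the
# scheduler placed behind it.

def stall_cycles(actual_latency, filler_insts):
    """Stall = whatever part of the load latency the fillers don't cover."""
    return max(0, actual_latency - filler_insts)

available_ilp = 6   # independent instructions available behind the load

# A traditional list scheduler assumes the best case (a 2-cycle cache
# hit) and schedules only enough fillers to cover that.
list_fillers = 2

# Balanced scheduling instead uses what the program can provide,
# placing all six independent instructions behind the load.
balanced_fillers = available_ilp

for latency in (2, 10):          # cache hit vs. cache miss
    print(f"latency {latency:2d}: "
          f"list stalls {stall_cycles(latency, list_fillers)}, "
          f"balanced stalls {stall_cycles(latency, balanced_fillers)}")
```

On a hit both schedules avoid stalling, but on a miss the list schedule stalls 8 cycles while the balanced schedule stalls only 4, which is the intuition behind its tolerance of latency variation.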
More importantly, because of its ability to tolerate variations in load interlocks, it improves its advantage over traditional scheduling.  Without the optimizations, balanced scheduled code is, on average, 1.05 times faster than that generated by a traditional scheduler; with them, its lead increases to 1.18.
<hr>
In <i>Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation</i>, La Jolla, California, June 1995, pages 151-162.
<br>
<p>To get the PostScript file, click <a href="http://www.cs.washington.edu/homes/jlo/papers/pldi95.ps">here</a>.</p>
<hr>
<address><em>jlo@cs.washington.edu</em><br></address>
</body>
</html>
