http://www.cs.washington.edu/homes/jlo/papers/pldi95abstract.html

Date: Tue, 10 Dec 1996 21:10:26 GMT
Server: NCSA/1.4.2
Content-type: text/html
Last-modified: Tue, 24 Oct 1995 01:11:54 GMT
Content-length: 2417

<html>
<head>
<title>"Improving Balanced Scheduling with Compiler Optimizations that Increase Instruction-Level Parallelism"</title>
</head>
<body>
<H1>Improving Balanced Scheduling with Compiler Optimizations that Increase Instruction-Level Parallelism</H1>
<hr>
<a href="http://www.cs.washington.edu/homes/jlo">Jack L. Lo</a> and <a href="http://www.cs.washington.edu/homes/eggers">Susan J. Eggers</a>
<hr>
Traditional list schedulers order instructions based on an optimistic estimate of the load latency imposed by the hardware and therefore cannot respond to variations in memory latency caused by cache hits and misses on non-blocking architectures.  In contrast, balanced scheduling schedules instructions based on an estimate of the amount of instruction-level parallelism in the program.  By scheduling independent instructions behind loads based on what the program can provide, rather than what the implementation stipulates in the best case (i.e., a cache hit), balanced scheduling can hide variations in memory latencies more effectively.
Since its success depends on the amount of instruction-level parallelism in the code, balanced scheduling should perform even better when more parallelism is available.  In this study, we combine balanced scheduling with three compiler optimizations that increase instruction-level parallelism: loop unrolling, trace scheduling and cache locality analysis.  Using code generated for the DEC Alpha by the Multiflow compiler, we simulated a non-blocking processor architecture that closely models the Alpha 21164.  Our results show that balanced scheduling benefits from all three optimizations, producing average speedups that range from 1.15 to 1.40 across the optimizations.
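The difference in how the two schedulers cover load latency can be illustrated with a toy model. This is a hypothetical sketch, not the paper's algorithm or measurements: it assumes single-cycle independent instructions, a 2-cycle cache hit, a 10-cycle miss, and six independent instructions available behind one load.

```python
# Toy model (illustrative only): how many cycles a pipeline stalls on a
# load, given how many independent single-cycle instructions the
# scheduler placed behind it.

def stall_cycles(actual_latency, filler_insts):
    """Stall = whatever part of the load latency the fillers don't cover."""
    return max(0, actual_latency - filler_insts)

available_ilp = 6   # independent instructions available behind the load

# A traditional list scheduler assumes the best case (a 2-cycle cache
# hit) and schedules only enough fillers to cover that.
list_fillers = 2

# Balanced scheduling instead uses what the program can provide,
# placing all six independent instructions behind the load.
balanced_fillers = available_ilp

for latency in (2, 10):          # cache hit vs. cache miss
    print(f"latency {latency:2d}: "
          f"list stalls {stall_cycles(latency, list_fillers)}, "
          f"balanced stalls {stall_cycles(latency, balanced_fillers)}")
```

On a hit both schedules avoid stalling, but on a miss the list schedule stalls 8 cycles while the balanced schedule stalls only 4, which is the intuition behind its tolerance of latency variation.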
More importantly, because of its ability to tolerate variations in load interlocks, it improves its advantage over traditional scheduling.  Without the optimizations, balanced scheduled code is, on average, 1.05 times faster than that generated by a traditional scheduler; with them, its lead increases to 1.18.
<hr>
In <i>Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation</i>, La Jolla, California, June 1995, pages 151-162.
<br>
<p>To get the PostScript file, click <a href="http://www.cs.washington.edu/homes/jlo/papers/pldi95.ps">here</a>.</p>
<hr>
<address><em>jlo@cs.washington.edu</em><br></address>
</body>
</html>
