AMCC suggested setting the PMU bit to 0 for best performance on the
PPC440 DDR controller. The common 440 DDR setup files (sdram.c and
spd_sdram.c) have been changed accordingly, so all 440 boards using
these setup routines automatically receive this performance
increase.

The following benchmarks, run by AMCC, demonstrate the performance
change:


----------------------------------------
SDRAM0_CFG0[PMU] = 1 (U-boot default for Bamboo, Yosemite and Yellowstone)
----------------------------------------

Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 112345 microseconds.
   (= 112345 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:          256.7683      0.1248       0.1246       0.1250
Scale:         246.0157      0.1302       0.1301       0.1302
Add:           255.0316      0.1883       0.1882       0.1885
Triad:         253.1245      0.1897       0.1896       0.1899

TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000 tcp ->localhost
ttcp-t: 16777216 bytes in 0.28 real seconds = 454.29 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.14, calls/sec = 7268.57
ttcp-t: 0.0user 0.1sys 0:00real 60% 0i+0d 0maxrss 0+2pf 3+1506csw


----------------------------------------
SDRAM0_CFG0[PMU] = 0 (Suggested modification)
Setting PMU = 0 provides a noticeable performance improvement:
* 2% to 5% improvement in memory performance (Stream).
* About 77% higher throughput in the TTCP benchmark
  (454.29 -> 804.06 Mbit/sec).
----------------------------------------

Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 120066 microseconds.
   (= 120066 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:          262.5167      0.1221       0.1219       0.1223
Scale:         258.4856      0.1238       0.1238       0.1240
Add:           262.5404      0.1829       0.1828       0.1831
Triad:         266.8594      0.1800       0.1799       0.1802

TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000 tcp ->localhost
ttcp-t: 16777216 bytes in 0.16 real seconds = 804.06 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.08, calls/sec = 12864.89
ttcp-t: 0.0user 0.0sys 0:00real 46% 0i+0d 0maxrss 0+2pf 120+1csw


2006-07-28, Stefan Roese <sr@denx.de>
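
The relative gains can be rechecked directly from the two result sets
above. The following is a minimal sketch, not part of U-Boot: the
helper names gain_pct() and report_gains() are made up for this
illustration, and the figures are copied from the Stream and TTCP
output listed in this file.

```c
#include <stdio.h>

/* Percentage gain of 'after' relative to 'before'.
 * Hypothetical helper, used only for this illustration. */
static double gain_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Rates copied from the benchmark output above:
 * pmu1 = SDRAM0_CFG0[PMU] = 1, pmu0 = SDRAM0_CFG0[PMU] = 0. */
struct result {
	const char *name;
	double pmu1;
	double pmu0;
};

static const struct result results[] = {
	{ "Copy (MB/s)",     256.7683, 262.5167 },
	{ "Scale (MB/s)",    246.0157, 258.4856 },
	{ "Add (MB/s)",      255.0316, 262.5404 },
	{ "Triad (MB/s)",    253.1245, 266.8594 },
	{ "TTCP (Mbit/sec)", 454.29,   804.06   },
};

/* Print one line per benchmark with the gain from PMU = 1 to PMU = 0. */
static void report_gains(void)
{
	size_t i;

	for (i = 0; i < sizeof(results) / sizeof(results[0]); i++)
		printf("%-16s %6.1f%%\n", results[i].name,
		       gain_pct(results[i].pmu1, results[i].pmu0));
}
```

Calling report_gains() from a small host-side main() shows the Stream
rates improving by roughly 2% to 5% and the TTCP throughput by about
77%.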