
📄 ffmpeg_powerpc_performance_evaluation_howto.txt

FFmpeg & evaluating performance on the PowerPC Architecture HOWTO

(c) 2003-2004 Romain Dolbeau <romain@dolbeau.org>


I - Introduction

The PowerPC architecture and its SIMD extension AltiVec offer some
interesting tools to evaluate performance and improve the code.
This document tries to explain how to use those tools with FFmpeg.

The architecture itself offers two ways to evaluate the performance of
a given piece of code:

1) The Time Base Registers (TBL)
2) The Performance Monitor Counter Registers (PMC)

The former are always available and always active, but they're not very
accurate: the registers increment by one every four *bus* cycles. On
my 667 MHz tiBook (ppc7450), this means once every twenty *processor*
cycles. So we won't use them.

The PMC are much more useful: not only can they report cycle-accurate
timing, but they can also be used to monitor many other parameters,
such as the number of AltiVec stalls for every kind of instruction,
or instruction cache misses. The downside is that not all processors
support the PMC (all G3, all G4 and the 970 do support them), and
they're inactive by default - you need to activate them with a
dedicated tool. Also, the number of available PMC depends on the
processor: the various 604 have 2, the various 75x (aka G3) have 4,
and the various 74xx (aka G4) have 6.

*WARNING*: The PowerPC 970 is not very well documented, and its PMC
registers are 64 bits wide. To let the code know about this, you *must*
tune for the 970 (using --tune=970), or the code will assume 32-bit
registers.


II - Enabling FFmpeg PowerPC performance support

This needs to be done by hand. First, you need to configure FFmpeg as
usual, but add the "--powerpc-perf-enable" option. For instance:

#####
./configure --prefix=/usr/local/ffmpeg-cvs --cc=gcc-3.3 --tune=7450 --powerpc-perf-enable
#####

This will configure FFmpeg to install inside /usr/local/ffmpeg-cvs,
compiling with gcc-3.3 (you should try to use this one or a newer
gcc), and tuning for the PowerPC 7450 (i.e.
the newer G4; as a rule of thumb, those at 550 MHz and more). It will
also enable the PMC.

You may also edit the file "config.h" to enable (uncomment) the
following line:

#####
// #define ALTIVEC_USE_REFERENCE_C_CODE 1
#####

If you enable this line, then the code will not make use of AltiVec,
but will use the reference C code instead. This is useful to compare
performance between two versions of the code.

Also, the number of enabled PMC is defined in "libavcodec/ppc/dsputil_ppc.h":

#####
#define POWERPC_NUM_PMC_ENABLED 4
#####

If you have a G4 CPU, you can enable all 6 PMC. DO NOT enable more
PMC than available on your CPU!

Then, simply compile FFmpeg as usual (make && make install).


III - Using FFmpeg PowerPC performance support

This FFmpeg can be used exactly as usual. But before exiting, FFmpeg
will dump a per-function report that looks like this:

#####
PowerPC performance report
 Values are from the PMC registers, and represent whatever the registers are set to record.
 Function "gmc1_altivec" (pmc1):
        min: 231
        max: 1339867
        avg: 558.25 (255302)
 Function "gmc1_altivec" (pmc2):
        min: 93
        max: 2164
        avg: 267.31 (255302)
 Function "gmc1_altivec" (pmc3):
        min: 72
        max: 1987
        avg: 276.20 (255302)
(...)
#####

In this example, PMC1 was set to record CPU cycles, PMC2 was set to
record AltiVec Permute Stall Cycles, and PMC3 was set to record AltiVec
Issue Stalls.

The function "gmc1_altivec" was monitored 255302 times, and the
minimum execution time was 231 processor cycles. The max and average
aren't much use, as it's very likely the OS interrupted execution for
reasons of its own :-(

With the exact same settings and source file, but using the reference C
code we get:

#####
PowerPC performance report
 Values are from the PMC registers, and represent whatever the registers are set to record.
 Function "gmc1_altivec" (pmc1):
        min: 592
        max: 2532235
        avg: 962.88 (255302)
 Function "gmc1_altivec" (pmc2):
        min: 0
        max: 33
        avg: 0.00 (255302)
 Function "gmc1_altivec" (pmc3):
        min: 0
        max: 350
        avg: 0.03 (255302)
(...)
#####

592 cycles, so the fastest AltiVec execution is about 2.5x faster than
the fastest C execution in this example. It's not perfect but it's not
bad (well I wrote this function so I can't say otherwise :-).

Once you have that kind of report, you can try to improve things by
finding what goes wrong and fixing it; in the example above, one
should try to diminish the number of AltiVec stalls, as this *may*
improve performance.


IV - Enabling the PMC in Mac OS X

This is easy. Use "MONster" and "monster". Those tools come from
Apple's CHUD package, and can be found hidden in the developer web
site & FTP site. "MONster" is the graphical application; use it to
generate a config file specifying what each register should
monitor. Then use the command-line application "monster" to use that
config file, and enjoy the results.

Note that "MONster" can be used for many other things, but it's
documented by Apple and is not my subject here.


V - Enabling the PMC on Linux

I don't know how to do it, sorry :-) Any ideas are very much welcome.

--
Romain Dolbeau
<romain@dolbeau.org>
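
Addendum to section III: the min/max/avg figures in the report, and the
"about 2.5x" comparison between the AltiVec and reference C runs, can be
reproduced with a small accumulator. What follows is only a minimal
sketch in C, not the code FFmpeg uses: the type "pmc_stats" and the
functions "pmc_stats_init", "pmc_stats_record", "pmc_stats_avg",
"pmc_min_speedup" and "pmc_stats_print" are hypothetical names made up
for illustration. It assumes each sample is the difference between two
reads of one counter, and it ignores counter wrap-around.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-function, per-counter statistics (illustrative only,
 * not taken from libavcodec/ppc/dsputil_ppc.h). */
typedef struct {
    uint64_t min;    /* smallest sample seen so far */
    uint64_t max;    /* largest sample seen so far */
    uint64_t sum;    /* running total, used for the average */
    uint64_t count;  /* number of samples recorded */
} pmc_stats;

void pmc_stats_init(pmc_stats *s)
{
    s->min = UINT64_MAX;
    s->max = 0;
    s->sum = 0;
    s->count = 0;
}

/* Record one sample: the counter value after the call minus the value
 * before it (assumed to be computed by the caller). */
void pmc_stats_record(pmc_stats *s, uint64_t delta)
{
    if (delta < s->min)
        s->min = delta;
    if (delta > s->max)
        s->max = delta;
    s->sum += delta;
    s->count++;
}

double pmc_stats_avg(const pmc_stats *s)
{
    assert(s->count > 0);
    return (double)s->sum / (double)s->count;
}

/* Ratio of the best reference run to the best optimized run. The minimum
 * is used because max and avg are polluted by OS interruptions. */
double pmc_min_speedup(const pmc_stats *reference, const pmc_stats *optimized)
{
    assert(optimized->min > 0);
    return (double)reference->min / (double)optimized->min;
}

/* Print one entry in a layout similar to the report shown in section III. */
void pmc_stats_print(const char *name, int pmc, const pmc_stats *s)
{
    printf(" Function \"%s\" (pmc%d):\n"
           "\tmin: %" PRIu64 "\n"
           "\tmax: %" PRIu64 "\n"
           "\tavg: %.2f (%" PRIu64 ")\n",
           name, pmc, s->min, s->max, pmc_stats_avg(s), s->count);
}
```

With the two pmc1 minimums from section III (592 cycles for the
reference C code, 231 cycles for AltiVec), pmc_min_speedup() gives
592 / 231, a little over 2.5.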
