<p><a name="binop"></a><br>The binary operations are...
<br><code>ADF -</code> Add
<br><code>DVF -</code> Divide
<br><code>FDV -</code> Fast Divide - only defined to work with single precision
<br><code>FML -</code> Fast Multiply - only defined to work with single precision
<br><code>FRD -</code> Fast Reverse Divide - only defined to work with single precision
<br><code>MUF -</code> Multiply
<br><code>POL -</code> Polar Angle
<br><code>POW -</code> Power
<br><code>RDF -</code> Reverse Divide
<br><code>RMF -</code> Remainder
<br><code>RPW -</code> Reverse Power
<br><code>RSF -</code> Reverse Subtract
<br><code>SUF -</code> Subtract<br>
<p><a name="unop"></a><br>The unary operations are...
<br><code>ABS -</code> Absolute Value
<br><code>ACS -</code> Arc Cosine
<br><code>ASN -</code> Arc Sine
<br><code>ATN -</code> Arc Tangent
<br><code>COS -</code> Cosine
<br><code>EXP -</code> Exponent
<br><code>LOG -</code> Logarithm to base 10
<br><code>LGN -</code> Logarithm to base e
<br><code>MVF -</code> Move
<br><code>MNF -</code> Move Negated
<br><code>NRM -</code> Normalise
<br><code>RND -</code> Round to integral value
<br><code>SIN -</code> Sine
<br><code>SQT -</code> Square Root
<br><code>TAN -</code> Tangent
<br><code>URD -</code> Unnormalised Round<br>
<p><a name="cmf"></a><br><code>CMF&lt;condition&gt;&lt;precision&gt;&lt;rounding&gt; &lt;fp register 1&gt;, &lt;fp register 2&gt;</code><br>Compare FP register 2 with FP register 1.<br>The variant CMFE compares with exception.
<p><a name="cnf"></a><br><code>CNF&lt;condition&gt;&lt;precision&gt;&lt;rounding&gt; &lt;fp register 1&gt;, &lt;fp register 2&gt;</code><br>Compare FP register 2 with the negative of FP register 1.<br>The variant CNFE compares with exception.
<p>Compares are provided with and without the exception that could arise if the numbers are unordered (ie one or both of them is not-a-number). 
To comply with IEEE 754, the CMF instruction should be used to test for equality (ie when a BEQ or BNE is used afterwards) or to test for unorderedness (in the V flag). The CMFE instruction should be used for all other tests (BGT, BGE, BLT, BLE afterwards).
<p>&nbsp;
<p>When the AC bit in the FPSR is clear, the ARM flags N, Z, C, V refer to the following after compares:
<br><code>N = </code>Less than
<br><code>Z = </code>Equal
<br><code>C = </code>Greater than, or equal
<br><code>V = </code>Unordered
<p>And when the AC bit is set, the flags refer to:
<br><code>N = </code>Less than
<br><code>Z = </code>Equal
<br><code>C = </code>Greater than, or equal, or unordered
<br><code>V = </code>Unordered
<p>&nbsp;
<p>In APCS code with <i>objasm</i>, to store a floating point value, you would use the directive DCF. You append 'S' for single precision, and 'D' for double.
<p>&nbsp;
<p>&nbsp;
<p>Here is a brief example. We multiply two numbers, but use the floating point unit instead of the ARM's multiplication instruction. This could be modified to multiply two floating point numbers, and give a floating point response, but as it is only a short example, it will simply use two integers.
<p><pre>REM &gt;fpmul
REM
REM Short example to multiply two integers via the
REM floating point unit. Totally pointless, but...
DIM code% 20
FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [  OPT loop%
   .multiply
     FLTS   F0, R0
     FLTS   F1, R1
     FMLS   F2, F0, F1
     FIXS   R0, F2
     MOVS   PC, R14
  ]
NEXT
INPUT &quot;First number  : &quot;one%
INPUT &quot;Second number : &quot;two%
A% = one%
B% = two%
result% = USR(multiply)
PRINT &quot;The result is &quot;+STR$(result%)
END</pre>There is no option to download this program, as standard BASIC won't touch it. 
However, you can include FP statements if you can 'build' the instructions.<br>Alternatively, you could use ExtBASasm by Darren Salt.
<p>This version will work in BASIC:<pre>REM &gt;fpmul
REM
REM Short example to multiply two integers via the
REM floating point unit. Totally pointless, but...
DIM code% 20
FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [  OPT loop%
   .multiply
     EQUD   &amp;EE000110   ; FLTS F0, R0
     EQUD   &amp;EE011110   ; FLTS F1, R1
     EQUD   &amp;EE902101   ; FMLS F2, F0, F1
     EQUD   &amp;EE100112   ; FIXS R0, F2
     MOVS   PC, R14
  ]
NEXT
INPUT &quot;First number  : &quot;one%
INPUT &quot;Second number : &quot;two%
A% = one%
B% = two%
result% = USR(multiply)
PRINT &quot;The result is &quot;+STR$(result%)
END</pre><div align = right><a href="sw/fpmul.basic"><i>Download this example</i></a></div>
<p>&nbsp;
<p>One final thing... Remember to use the appropriate precision for what you are doing.<pre>REM &gt;precision
REM
REM Short example to show how data can be 'lost' due
REM to using incorrect precision.
ON ERROR PRINT REPORT$ + &quot; at &quot; + STR$(ERL/10) : END
DIM code% 64
FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [  OPT loop%
     EXT 1
   .single_precision
     FLTS   F0, R0
     FIX    R0, F0
     MOV    PC, R14
   .double_precision
     FLTD   F0, R0
     FIX    R0, F0
     MOV    PC, R14
   .doubleext_precision
     FLTE   F0, R0
     FIX    R0, F0
     MOV    PC, R14
  ]
NEXT
A% = &amp;1ffffff
PRINT &quot;Original input is &quot; + STR$~A%
PRINT &quot;Single precision  &quot; + STR$~(USR(single_precision))
PRINT &quot;Double precision  &quot; + STR$~(USR(double_precision))
PRINT &quot;Double extended   &quot; + STR$~(USR(doubleext_precision))
PRINT
END</pre>The result of this program is:<pre>Original input is 1FFFFFF
Single precision  2000000
Double precision  1FFFFFF
Double extended   1FFFFFF</pre>You don't need to use double precision everywhere, though, as it will be that much slower. 
Simply keep this in mind if you are dealing with large numbers.
<p>&nbsp;
<p>In order to test the actual speed differences, I wrote a test program:<pre>DIM code% 64
FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [  OPT loop%
     MOV    R0, #23
     MOV    R1, #1&lt;&lt;16
   .timetest
     FLTD   F0, R0
     FLTD   F1, R0
     MUFD   F2, F0, F1
     SUBS   R1, R1, #1
     BNE    timetest
     MOV    PC, R14
  ]
NEXT
t% = TIME
CALL code%
PRINT "That took "+STR$(TIME - t%)+" centiseconds."
END</pre>I tried various precisions, and also the fast multiply. It showed something interesting. So I tried multiplication, and addition. All with the same data (input 23).
<p>&nbsp;
<p>Here are my results for a million (roughly) convert-and-process operations (ARM710 processor, FPEmulator 4.14):<pre>   <u>Operation        Fast single   Single        Double        Double extended</u>
   Multiplication   1731cs        1755cs        1965cs        1712cs
   Division         2169cs        2169cs        2618cs        2479cs
   Addition         n/a           1684cs        1899cs        1646cs</pre>This seems to show that double extended precision is the fastest on my machine for a selection of operations. Thus, it is incorrect to simply assume that more complexity takes more time. My personal suspicion here is that the internal format <i>is</i> double extended, thus working directly with it entails no loss due to converting the value to a different precision.
<p>The moral here? Don't be afraid to experiment...
<p><hr size = 3><a href="index.html#10">Return to assembler index</a><hr size = 3><address>Copyright &copy; 2000 Richard Murray</address></body></html>