CONTENTS
--------
* Math Support
* Floating Point
* Floating Point Exception Flags
* IEEE754 interoperability
* Fixed Point, Introduction
* Constants
* Fixed Point Interoperability
* Integer Libraries
* Fixed Point Libraries
* Floating Point Library
* Fast and Compact Inline Operations
* Using Prototypes and Multiple Code Pages
* Fixed Point Example
* Floating Point Example
* Code Size Comparison
* How to save code space
MATH SUPPORT
------------
The math support includes integer, fixed and floating point math
including library functions:
Integer: 8, 16, 24 and 32 bit, with and without sign
Fixed point: 20 different formats, with and without sign
Floating point: 16, 24 and 32 bit
Math support for each compiler edition:
           STANDARD    EXTENDED
  int      8+16+24     8+16+24+32
  fixed    8+16+24     8+16+24+32
  float    24+32       16+24+32
The compiler will automatically locate the required function for
an operation like 'a*b'.
The following command line options are available:
-we: no warning when fixed point constants are rounded
-wO: warning on operator library calls
-wI: warning on long inline code for multiplication and division
Fixed point requires manual worst case analysis to get correct
results. This must include calculation of accumulated error and
avoiding truncation and loss of significant bits. It is often
straightforward to get correct results when using floating point.
However, floating point functions require significantly more
code.
In general, floating point and fixed point are both slow to
execute. Note that floating point is FASTER than fixed point on
multiplication and division, but slower on most other operations.
Operations not found in the libraries are handled by the built-in
code generator. Also, the compiler will use inline code for
operations that are most efficiently handled inline.
SAVE CODE AND SAVE RAM: All libraries are optimized to get compact
code. The floating point library is more compact than the
Microchip floating point libraries written in assembly. All
variables (except for the floating point flags) are allocated on
the generated stack to enable efficient RAM reuse with other local
variables. A new concept of transparent sharing of parameters in a
library is introduced to save code.
CC8E will automatically delete unused library functions. This
feature can also be used to delete unused user application
functions.
#pragma library 1
.. library functions that are deleted if unused
#pragma library 0
.. remaining user application
The normal use of '#pragma library' is around source library
files that are included in the user application.
FLOATING POINT
--------------
The compiler supports 16, 24 and 32 bit floating point. The
32 bit floating point can be converted to and from IEEE754 by
3 instructions (macro in math32f.h).
  Format    Resolution    Range
  16 bit    2.4 digits    +/- 3.4e38, +/- 1.1e-38
  24 bit    4.8 digits    +/- 3.4e38, +/- 1.1e-38
  32 bit    7.2 digits    +/- 3.4e38, +/- 1.1e-38
Note that 16 bit floating point is intended for special
use where accuracy is less important.
Supported floating point types:
float16 : 16 bit floating point
float, float24 : 24 bit floating point
double, float32 : 32 bit floating point
32 bit floating point format:
address ID
X a.low8 : LSB, bit 0-7 of mantissa
X+1 a.midL8 : bit 8-15 of mantissa
X+2 a.midH8 : bit 16-22 of mantissa, bit 23: sign bit
X+3 a.high8 : MSB, bit 0-7 of exponent, with bias 0x7F
bit 23 of mantissa is a hidden bit, always equal to 1
zero (0.0) : a.high8 = 0 (mantissa & sign ignored)
MSB LSB
7F 00 00 00 : 1.0 = 1.0 * 2**(0x7F-0x7F) = 1.0 * 1
7F 80 00 00 : -1.0 = -1.0 * 2**(0x7F-0x7F) = -1.0 * 1
80 00 00 00 : 2.0 = 1.0 * 2**(0x80-0x7F) = 1.0 * 2
80 40 00 00 : 3.0 = 1.5 * 2**(0x80-0x7F) = 1.5 * 2
7E 60 00 00 : 0.875 = 1.75 * 2**(0x7E-0x7F) = 1.75 * 0.5
7F 60 00 00 : 1.75 = 1.75 * 2**(0x7F-0x7F) = 1.75 * 1
7F 7F FF FF : 1.9999998808
00 7C E3 5A : 0.0 (mantissa & sign ignored)
00 00 00 00 : 0.0
01 00 00 00 : 1.1754943508e-38 : smallest number above zero
FE 7F FF FF : 3.4028234664e+38 : largest number
FF 00 00 00 : +INF : positive infinity
FF 80 00 00 : -INF : negative infinity
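The byte layout above can be checked on a host PC with a small
decoder. The following is a minimal sketch in portable C, not part of
the CC8E libraries: the function name is hypothetical, the bytes are
passed MSB first as in the table, and the exponent value 0xFF (listed
above as infinity) is not special-cased.

```c
#include <stdint.h>

/* Decode the 32 bit format described above into a host double.
   Hypothetical helper for illustration; bytes are given MSB first. */
double cc8e_float32_to_double(uint8_t high8, uint8_t midH8,
                              uint8_t midL8, uint8_t low8)
{
    if (high8 == 0)
        return 0.0;                 /* zero: mantissa and sign ignored */

    /* Restore the hidden bit (bit 23) above the 23 stored bits. */
    uint32_t mant = 0x800000u
                  | ((uint32_t)(midH8 & 0x7Fu) << 16)
                  | ((uint32_t)midL8 << 8)
                  | low8;

    double value = (double)mant / 8388608.0;   /* divide by 2**23 */

    /* Apply 2**(exponent - 0x7F) by repeated scaling. */
    int e = (int)high8 - 0x7F;
    for (; e > 0; e--) value *= 2.0;
    for (; e < 0; e++) value *= 0.5;

    /* Bit 7 of midH8 (bit 23 of the value) is the sign bit. */
    return (midH8 & 0x80u) ? -value : value;
}
```

For example, the bytes 7E 60 00 00 decode to 0.875 and 80 40 00 00
decode to 3.0, matching the table.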
24 bit floating point format:
address ID
X a.low8 : LSB, bit 0-7 of mantissa
X+1 a.mid8 : bit 8-14 of mantissa, bit 15: sign bit
X+2 a.high8 : MSB, bit 0-7 of exponent, with bias 0x7F
bit 15 of mantissa is a hidden bit, always equal to 1
zero (0.0) : a.high8 = 0 (mantissa & sign ignored)
MSB LSB
7F 00 00 : 1.0 = 1.0 * 2**(0x7F-0x7F) = 1.0 * 1
7F 80 00 : -1.0 = -1.0 * 2**(0x7F-0x7F) = -1.0 * 1
80 00 00 : 2.0 = 1.0 * 2**(0x80-0x7F) = 1.0 * 2
80 40 00 : 3.0 = 1.5 * 2**(0x80-0x7F) = 1.5 * 2
7E 60 00 : 0.875 = 1.75 * 2**(0x7E-0x7F) = 1.75 * 0.5
7F 60 00 : 1.75 = 1.75 * 2**(0x7F-0x7F) = 1.75 * 1
7F 7F FF : 1.999969482
00 7C 5A : 0.0 (mantissa & sign ignored)
01 00 00 : 1.17549435e-38 : smallest number above zero
FE 7F FF : 3.40277175e+38 : largest number
FF 00 00 : +INF : positive infinity
FF 80 00 : -INF : negative infinity
16 bit floating point format:
address ID
X a.low8 : LSB, bit 0-6 of mantissa, bit 7: sign bit
X+1 a.high8 : MSB, bit 0-7 of exponent, with bias 0x7F
bit 7 of mantissa is a hidden bit, always equal to 1
zero (0.0) : a.high8 = 0 (mantissa & sign ignored)
MSB LSB
7F 00 : 1.0 = 1.0 * 2**(0x7F-0x7F) = 1.0 * 1
7F 80 : -1.0 = -1.0 * 2**(0x7F-0x7F) = -1.0 * 1
80 00 : 2.0 = 1.0 * 2**(0x80-0x7F) = 1.0 * 2
80 40 : 3.0 = 1.5 * 2**(0x80-0x7F) = 1.5 * 2
7E 60 : 0.875 = 1.75 * 2**(0x7E-0x7F) = 1.75 * 0.5
7F 60 : 1.75 = 1.75 * 2**(0x7F-0x7F) = 1.75 * 1
7F 7F : 1.9921875
00 7C : 0.0 (mantissa & sign ignored)
01 00 : 1.175494e-38 : smallest number above zero
FE 7F : 3.389531e+38 : largest number
FF 00 : +INF : positive infinity
FF 80 : -INF : negative infinity
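The three layouts share the exponent byte and place the sign bit at
the top of the most significant mantissa byte. As a consequence, the
upper two bytes of a 32 bit value form a valid 16 bit value with a
truncated mantissa. The sketch below models this on a host PC; the
function name is hypothetical, and the compiler's own conversion may
round instead of truncate.

```c
#include <stdint.h>

/* raw32 holds the 32 bit format as high8:midH8:midL8:low8.
   Returns the 16 bit format as high8:low8 (mantissa truncated).
   Hypothetical helper for illustration only. */
uint16_t cc8e_float32_to_float16_trunc(uint32_t raw32)
{
    return (uint16_t)(raw32 >> 16);
}
```

For example, 7E 60 00 00 (0.875) becomes 7E 60, and 7F 7F FF FF
becomes 7F 7F (1.9921875).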
FLOATING POINT EXCEPTION FLAGS
------------------------------
The floating point flags are accessible in the application program.
At program startup the flags should be initialized:
FpFlags = 0; // reset all flags, disable rounding
FpRounding = 1; // enable rounding
Also, after an exception is detected and handled in the
application, the exception bit should be cleared so that new
exceptions can be detected. Exceptions can be ignored if this is
most convenient. New operations are not affected by old
exceptions. This also enables delayed handling of exceptions. Only
the application program can clear exception flags.
char FpFlags; // contains the floating point flags
bit FpOverflow @ FpFlags.1; // floating point overflow
bit FpUnderFlow @ FpFlags.2; // floating point underflow
bit FpDiv0 @ FpFlags.3; // floating point divide by zero
bit FpDomainError @ FpFlags.5; // domain error exception
bit FpRounding @ FpFlags.6; // floating point rounding
// FpRounding=0: truncation
// FpRounding=1: unbiased rounding to nearest LSB
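The handling described above (detect, handle, then clear the
exception bits without touching the rounding mode) can be sketched in
portable C as follows. This is a host-side model only: the variable
fp_flags_model stands in for FpFlags, the macro and function names are
hypothetical, and the bit positions are copied from the declarations
above.

```c
#include <stdint.h>

/* Bit masks matching the FpFlags bit positions listed above. */
#define FP_OVERFLOW      (1u << 1)
#define FP_UNDERFLOW     (1u << 2)
#define FP_DIV0          (1u << 3)
#define FP_DOMAIN_ERROR  (1u << 5)
#define FP_ROUNDING      (1u << 6)
#define FP_EXCEPTIONS    (FP_OVERFLOW | FP_UNDERFLOW | \
                          FP_DIV0 | FP_DOMAIN_ERROR)

/* Host-side stand-in for the FpFlags byte. */
uint8_t fp_flags_model;

/* Return the pending exception bits and clear them, so that new
   exceptions can be detected. The rounding bit is left unchanged. */
uint8_t fp_take_exceptions(void)
{
    uint8_t pending = (uint8_t)(fp_flags_model & FP_EXCEPTIONS);
    fp_flags_model &= (uint8_t)~FP_EXCEPTIONS;
    return pending;
}
```

Because old exceptions do not affect new operations, such a check can
be made once after a whole sequence of calculations.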
IEEE754 INTEROPERABILITY
------------------------
The floating point format used is not equivalent to the IEEE754
standard, but the difference is very small. The reason for using a
different format is code efficiency.
IEEE compatibility is needed when floating point values are
exchanged with the outside world. It may also happen that
inspecting variables during debugging requires the IEEE754 format
on some emulators/debuggers.
Macros for converting to and from IEEE754 are available:
math32f.h:
// before sending a floating point value out of the controller:
float32ToIEEE754(floatVar); // change to IEEE754 (3 instr.)
// before using a floating point value received from outside:
IEEE754ToFloat32(floatVar); // change from IEEE754 (3 instr.)
math24f.h:
float24ToIEEE754(floatVar); // change to IEEE754 (3 instr.)
IEEE754ToFloat24(floatVar); // change from IEEE754 (3 instr.)
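Comparing the 32 bit layout described earlier with IEEE754 single
precision, both use an 8 bit exponent with bias 0x7F and a 23 bit
mantissa with a hidden bit; only the position of the sign bit differs
(bit 23 here, bit 31 in IEEE754). The sketch below shows that
rearrangement on a host PC. It is an illustration derived from the two
layouts, not the macro bodies from math32f.h, and the function names
are hypothetical. Denormals and NaN are not considered.

```c
#include <stdint.h>

/* v = high8:midH8:midL8:low8 in the compiler's 32 bit format.
   Returns the IEEE754 single precision bit pattern. */
uint32_t cc8e32_to_ieee754_bits(uint32_t v)
{
    uint32_t sign = (v >> 23) & 1u;     /* sign sits below the exponent */
    uint32_t exp  = v >> 24;            /* exponent is the top byte */
    uint32_t mant = v & 0x7FFFFFu;
    return (sign << 31) | (exp << 23) | mant;
}

/* Inverse rearrangement: IEEE754 bit pattern to compiler format. */
uint32_t ieee754_bits_to_cc8e32(uint32_t v)
{
    uint32_t sign = v >> 31;
    uint32_t exp  = (v >> 23) & 0xFFu;
    uint32_t mant = v & 0x7FFFFFu;
    return (exp << 24) | (sign << 23) | mant;
}
```

For example, 7F 80 00 00 (-1.0) maps to the IEEE754 pattern
0xBF800000, and 80 40 00 00 (3.0) maps to 0x40400000.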
FIXED POINT, INTRODUCTION
-------------------------
Fixed point can be used instead of floating point, mainly to save
program space. Fixed point math uses formats where the decimal
point is permanently set at byte boundaries. For example, fixed8_8
uses one byte for the integer part and one byte for the decimal
part. Fixed point operations map nicely to integer operations
except for multiplication and division, which are supported by
library functions.
Example: fixed8_8 fx;
fx.low8 : Least significant byte, decimal part
fx.high8 : Most significant byte, integer part
MSB LSB 1/256 = 0.00390625
07 01 : 7 + 0x01*0.00390625 = 7.00390625
07 80 : 7 + 0x80*0.00390625 = 7.5
07 FF : 7 + 0xFF*0.00390625 = 7.99609375
00 00 : 0
FF 00 : -1
FF FF : -1 + 0xFF*0.00390625 = -0.00390625
7F 00 : +127
7F FF : +127 + 0xFF*0.00390625 = 127.99609375
80 00 : -128
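The examples above treat a fixed8_8 value as a signed 16 bit integer
scaled by 1/256. The following host-side sketch in portable C decodes
such a value and shows why multiplication needs more than a plain
integer operation: the product of two values carries 16 decimal bits
and must be scaled back to 8. The function names are hypothetical, no
overflow check is made, and the result is truncated toward zero, which
may differ from the library functions.

```c
#include <stdint.h>

/* Convert fixed8_8 bytes (MSB, LSB) to a host double. */
double fixed8_8_to_double(uint8_t high8, uint8_t low8)
{
    int32_t raw = ((int32_t)high8 << 8) | low8;
    if (raw >= 0x8000)
        raw -= 0x10000;             /* two's complement sign */
    return raw / 256.0;             /* 8 decimal bits */
}

/* Multiply two fixed8_8 raw values using a 32 bit intermediate. */
int16_t fixed8_8_mul_raw(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;  /* 16 decimal bits */
    return (int16_t)(product / 256);            /* back to 8 bits */
}
```

For example, the raw values 0x0780 (7.5) and 0x0200 (2.0) multiply to
0x0F00 (15.0).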
Convention: fixed<S><I>_<D> :
<S> : 'U' : unsigned
<none>: signed
<I> : number of integer bits
<D> : number of decimal bits
Thus, fixed16_8 uses 16 bits for the integer part plus 8 bits
for the decimals, a total of 24 bits. The resolution for fixed16_8
is 1/256=0.0039 which is the lowest possible increment. This is
equivalent to 2 decimal digits (actually 2.4 decimal digits).
Built in fixed point types:
  Type:         #bytes    Range                       Resolution
  fixed8_8      2 (1+1)   -128, +127.996              0.00390625
  fixed8_16     3 (1+2)   -128, +127.99998            0.000015259
  fixed8_24     4 (1+3)   -128, +127.99999994         0.000000059605
  fixed16_8     3 (2+1)   -32768, +32767.996          0.00390625
  fixed16_16    4 (2+2)   -32768, +32767.99998        0.000015259
  fixed24_8     4 (3+1)   -8388608, +8388607.996      0.00390625
  fixedU8_8     2 (1+1)   0, +255.996                 0.00390625
  fixedU8_16    3 (1+2)   0, +255.99998               0.000015259
  fixedU8_24    4 (1+3)   0, +255.99999994            0.000000059605
  fixedU16_8    3 (2+1)   0, +65535.996               0.00390625
  fixedU16_16   4 (2+2)   0, +65535.99998             0.000015259
  fixedU24_8    4 (3+1)   0, +16777215.996            0.00390625
  (additional types with decimals only; no integer part)
  fixed_8       1 (0+1)   -0.5, +0.496                0.00390625
  fixed_16      2 (0+2)   -0.5, +0.49998              0.000015259
  fixed_24      3 (0+3)   -0.5, +0.49999994           0.000000059605
  fixed_32      4 (0+4)   -0.5, +0.4999999998         0.0000000002328
  fixedU_8      1 (0+1)   0, +0.996                   0.00390625
  fixedU_16     2 (0+2)   0, +0.99998                 0.000015259
  fixedU_24     3 (0+3)   0, +0.99999994              0.000000059605
  fixedU_32     4 (0+4)   0, +0.9999999998            0.0000000002328
To sum up:
1. All types ending in _8 have 2 correct digits after the decimal
point and a maximum error of 2 on the 3rd decimal digit.
2. All types ending in _16 have 4 correct digits after the decimal
point and a maximum error of 1 on the 5th decimal digit.
3. All types ending in _24 have 7 correct digits after the decimal
point and a maximum error of 3 on the 8th decimal digit.
4. All types ending in _32 have 9 correct digits after the decimal
point and a maximum error of 2 on the 11th decimal digit.
FIXED POINT CONSTANTS
---------------------
The 32 bit floating point format is used at compile time when
constants and constant expressions are calculated.
fixed8_8 a = 10.24;
fixed16_8 a = 8 * 1.23;
fixed8_16 x = 2.3e-3;
fixed8_16 x = 23.45e1;
fixed8_16 x = 23.45e-2;
fixed8_16 x = 0.;
fixed8_16 x = -1.23;
Constant rounding error example:
Constant: 0.036
Variable type: fixed16_8 (1 byte for decimals)
Error calculation: 0.036*256=9.216
The byte values assigned to the variable are simply: 0, 0, 9
The error is: (9/256-0.036)/0.036 = -0.023
The compiler prints this normalized error as a warning.
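The error calculation above can be reproduced on a host PC. The
sketch below is in portable C; the function name is hypothetical, and
rounding to the nearest step is an assumption (for 0.036 the stored
value is 9 with either rounding or truncation). The constant must be
nonzero.

```c
/* Normalized error when 'constant' is stored with 'decimal_bits'
   bits after the decimal point. Hypothetical helper; assumes the
   constant is rounded to the nearest representable step. */
double fixed_const_rel_error(double constant, int decimal_bits)
{
    double scale = 1.0;
    for (int i = 0; i < decimal_bits; i++)
        scale *= 2.0;                       /* 2**decimal_bits */

    double scaled = constant * scale;       /* e.g. 0.036*256 = 9.216 */
    long stored = (long)(scaled + (scaled >= 0 ? 0.5 : -0.5));

    return (stored / scale - constant) / constant;
}
```

Calling fixed_const_rel_error(0.036, 8) gives approximately -0.023,
the value shown in the example.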
TYPE CONVERSION
---------------
The fixed point types are handled as subtypes of float. Type casts
are therefore infrequently required.
FIXED POINT INTEROPERABILITY
----------------------------
It is recommended to stick to one fixed point format in a program.
The main problem when using mixed types is the enormous number of
combinations which makes library support a challenge. However,
many mixed operations are allowed when CC8E can map the types to
the built in integer code generator:
fixed8_16 a, b;
fixed_16 c;
a = b + c; // OK, code is generated directly
a = b * 10.22; // OK: library function is supplied
a = b * c; // a new user library function is required!
// A type cast can select an existing library function:
a = b * (fixed8_16)c;
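The addition 'a = b + c' above can be generated directly because both
types have 16 decimal bits, so the decimal points are already aligned
and only a sign extension of the shorter operand is needed. The
host-side sketch below models this arithmetic in portable C. It is not
the code the compiler generates; the function names are hypothetical,
and the 24 bit raw value is held in an int32_t without modelling
overflow.

```c
#include <stdint.h>

/* b_raw: fixed8_16 raw value (24 bit signed, 16 decimal bits).
   c_raw: fixed_16 raw value (16 bit signed, 16 decimal bits).
   Decimal points are aligned, so an integer add is sufficient. */
int32_t fixed8_16_add_fixed_16(int32_t b_raw, int16_t c_raw)
{
    return b_raw + (int32_t)c_raw;      /* sign extend, then add */
}

/* Convert a fixed8_16 raw value to a host double for inspection. */
double fixed8_16_to_double(int32_t raw)
{
    return raw / 65536.0;               /* 16 decimal bits */
}
```

For example, adding the raw values 0x018000 (1.5) and 0x4000 (0.25)
gives 0x01C000 (1.75). A multiplication 'b * c' would instead produce
32 decimal bits and need rescaling, which is why it requires a library
function.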
INTEGER LIBRARIES
-----------------
The math integer libraries allow selection between different
optimizations, speed or size.
The libraries contain operations for multiplication, division
and division remainder.
math16.h : basic library, up to 16 bit, signed and unsigned
math24.h : basic library, up to 24 bit, signed and unsigned
math32.h : basic library, up to 32 bit, signed and unsigned
math16m.h : speed & size, 8*8, 16*16
math24m.h : speed & size, 8*8, 16*16, and 24*8 multiply.
math32m.h : speed & size, 8*8, 16*16, and 32*8 multiply.
These libraries can be used when execution speed
is critical.
NOTE 1: they must be included first (before math??.h)
NOTE 2: math??.h contains similar functions (which
are deleted)
The min and max timing cycles have been found by simulating many
thousands of calculations. However, the min and max limits are not
guaranteed to be correct.
Sign: -: unsigned, S: signed