SoftFloat Release 2b General Documentation

John R. Hauser
2002 May 27


----------------------------------------------------------------------------
Introduction

SoftFloat is a software implementation of floating-point that conforms to
the IEC/IEEE Standard for Binary Floating-Point Arithmetic.  As many as four
formats are supported:  single precision, double precision, extended double
precision, and quadruple precision.  All operations required by the standard
are implemented, except for conversions to and from decimal.

This document gives information about the types defined and the routines
implemented by SoftFloat.  It does not attempt to define or explain the
IEC/IEEE Floating-Point Standard.  Details about the standard are available
elsewhere.


----------------------------------------------------------------------------
Limitations

SoftFloat is written in C and is designed to work with other C code.  The
SoftFloat header files assume an ISO/ANSI-style C compiler.  No attempt
has been made to accommodate compilers that are not ISO-conformant.  In
particular, the distributed header files will not be acceptable to any
compiler that does not recognize function prototypes.

Support for the extended double-precision and quadruple-precision formats
depends on a C compiler that implements 64-bit integer arithmetic.  If the
largest integer format supported by the C compiler is 32 bits, SoftFloat
is limited to only single and double precisions.  When that is the case,
all references in this document to extended double precision, quadruple
precision, and 64-bit integers should be ignored.


----------------------------------------------------------------------------
Contents

    Introduction
    Limitations
    Contents
    Legal Notice
    Types and Functions
    Rounding Modes
    Extended Double-Precision Rounding Precision
    Exceptions and Exception Flags
    Function Details
        Conversion Functions
        Standard Arithmetic Functions
        Remainder Functions
        Round-to-Integer Functions
        Comparison Functions
        Signaling NaN Test Functions
        Raise-Exception Function
    Contact Information



----------------------------------------------------------------------------
Legal Notice

SoftFloat was written by John R. Hauser.  This work was made possible in
part by the International Computer Science Institute, located at Suite 600,
1947 Center Street, Berkeley, California 94704.  Funding was partially
provided by the National Science Foundation under grant MIP-9311980.  The
original version of this code was written as part of a project to build
a fixed-point vector processor in collaboration with the University of
California at Berkeley, overseen by Profs. Nelson Morgan and John Wawrzynek.

THIS SOFTWARE IS DISTRIBUTED AS IS, FOR FREE.  Although reasonable effort
has been made to avoid it, THIS SOFTWARE MAY CONTAIN FAULTS THAT WILL AT
TIMES RESULT IN INCORRECT BEHAVIOR.  USE OF THIS SOFTWARE IS RESTRICTED TO
PERSONS AND ORGANIZATIONS WHO CAN AND WILL TAKE FULL RESPONSIBILITY FOR ALL
LOSSES, COSTS, OR OTHER PROBLEMS THEY INCUR DUE TO THE SOFTWARE, AND WHO
FURTHERMORE EFFECTIVELY INDEMNIFY JOHN HAUSER AND THE INTERNATIONAL COMPUTER
SCIENCE INSTITUTE (possibly via similar legal warning) AGAINST ALL LOSSES,
COSTS, OR OTHER PROBLEMS INCURRED BY THEIR CUSTOMERS AND CLIENTS DUE TO THE
SOFTWARE.


----------------------------------------------------------------------------
Types and Functions

When 64-bit integers are supported by the compiler, the `softfloat.h'
header file defines four types:  `float32' (single precision), `float64'
(double precision), `floatx80' (extended double precision), and `float128'
(quadruple precision).  The `float32' and `float64' types are defined in
terms of 32-bit and 64-bit integer types, respectively, while the `float128'
type is defined as a structure of two 64-bit integers, taking into account
the byte order of the particular machine being used.  The `floatx80' type
is defined as a structure containing one 16-bit and one 64-bit integer, with
the machine's byte order again determining the order within the structure.

When 64-bit integers are _not_ supported by the compiler, the `softfloat.h'
header file defines only two types:  `float32' and `float64'.  Because
ISO/ANSI C guarantees at least one built-in integer type of 32 bits,
the `float32' type is identified with an appropriate integer type.  The
`float64' type is defined as a structure of two 32-bit integers, with the
machine's byte order determining the order of the fields.

In either case, the types in `softfloat.h' are defined such that if a system
implements the usual C `float' and `double' types according to the IEC/IEEE
Standard, then the `float32' and `float64' types should be indistinguishable
in memory from the native `float' and `double' types.  (On the other hand,
when `float32' or `float64' values are placed in processor registers by
the compiler, the type of registers used may differ from those used for the
native `float' and `double' types.)
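
As a rough illustration of the memory-layout statement above, the following
sketch copies the bytes of a native `float' into a 32-bit integer and back.
The names `float32_bits', `float_to_bits', and `bits_to_float' are
hypothetical stand-ins invented for this example, not SoftFloat identifiers.
The sketch assumes that the native `float' follows the IEC/IEEE single-
precision format and that the compiler provides `<stdint.h>'.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for SoftFloat's `float32'.  The real type is
   defined in `softfloat.h' in terms of a 32-bit integer type. */
typedef uint32_t float32_bits;

/* Copy the object representation of a native float into a 32-bit
   integer.  memcpy avoids the aliasing problems of a pointer cast. */
float32_bits float_to_bits( float f )
{
    float32_bits bits;
    memcpy( &bits, &f, sizeof bits );
    return bits;
}

/* Inverse of float_to_bits. */
float bits_to_float( float32_bits bits )
{
    float f;
    memcpy( &f, &bits, sizeof f );
    return f;
}
```

Under the stated assumptions, `float_to_bits( 1.0f )' yields the single-
precision encoding 0x3F800000.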

SoftFloat implements the following arithmetic operations:

-- Conversions among all the floating-point formats, and also between
   integers (32-bit and 64-bit) and any of the floating-point formats.

-- The usual add, subtract, multiply, divide, and square root operations
   for all floating-point formats.

-- For each format, the floating-point remainder operation defined by the
   IEC/IEEE Standard.

-- For each floating-point format, a ``round to integer'' operation that
   rounds to the nearest integer value in the same format.  (The floating-
   point formats can hold integer values, of course.)

-- Comparisons between two values in the same floating-point format.

The only functions required by the IEC/IEEE Standard that are not provided
are conversions to and from decimal.


----------------------------------------------------------------------------
Rounding Modes

All four rounding modes prescribed by the IEC/IEEE Standard are implemented
for all operations that require rounding.  The rounding mode is selected
by the global variable `float_rounding_mode'.  This variable may be set
to one of the values `float_round_nearest_even', `float_round_to_zero',
`float_round_down', or `float_round_up'.  The rounding mode is initialized
to nearest/even.
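
To make the effect of the four modes concrete, the sketch below rounds an
integer magnitude after discarding some of its low bits.  It is only an
illustration of the rounding rules, not SoftFloat's own code.  The enum
reuses the mode names from this section as local stand-ins with placeholder
values, and the function `round_magnitude' is hypothetical.

```c
#include <stdint.h>

/* Local stand-ins for the names in `softfloat.h'.  The numeric values are
   placeholders chosen for this sketch, not SoftFloat's actual values. */
enum {
    float_round_nearest_even,
    float_round_to_zero,
    float_round_down,
    float_round_up
};

/* Round the magnitude `mag' of a signed value after discarding its low
   `shift' bits (1 <= shift <= 63).  `negative' is nonzero when the value
   being rounded is below zero.  Returns the rounded magnitude. */
uint64_t round_magnitude( uint64_t mag, int negative, int shift, int mode )
{
    uint64_t kept = mag >> shift;
    uint64_t lost = mag & ( ( (uint64_t) 1 << shift ) - 1 );
    uint64_t half = (uint64_t) 1 << ( shift - 1 );

    if ( lost == 0 ) return kept;            /* exact: nothing to round */
    switch ( mode ) {
     case float_round_nearest_even:
        if ( lost > half ) return kept + 1;
        if ( lost == half ) return kept + ( kept & 1 );  /* tie: to even */
        return kept;
     case float_round_down:                  /* toward minus infinity */
        return negative ? kept + 1 : kept;
     case float_round_up:                    /* toward plus infinity */
        return negative ? kept : kept + 1;
     default:                                /* float_round_to_zero */
        return kept;
    }
}
```

For example, with `mag' equal to 5 and `shift' equal to 1 (the value 2.5),
nearest/even and round-to-zero both give 2, while round-up gives 3 for a
positive value and 2 for a negative one.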


----------------------------------------------------------------------------
Extended Double-Precision Rounding Precision

For extended double precision (`floatx80') only, the rounding precision
of the standard arithmetic operations is controlled by the global variable
`floatx80_rounding_precision'.  The operations affected are:

   floatx80_add   floatx80_sub   floatx80_mul   floatx80_div   floatx80_sqrt

When `floatx80_rounding_precision' is set to its default value of 80, these
operations are rounded (as usual) to the full precision of the extended
double-precision format.  Setting `floatx80_rounding_precision' to 32
or to 64 causes the operations listed to be rounded to reduced precision
equivalent to single precision (`float32') or to double precision
(`float64'), respectively.  When rounding to reduced precision, additional
bits in the result significand beyond the rounding point are set to zero.
The consequences of setting `floatx80_rounding_precision' to a value other
than 32, 64, or 80 are not specified.  Operations other than the ones listed
above are not affected by `floatx80_rounding_precision'.
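
The zeroed bits can be pictured with a small sketch.  It assumes the usual
layout in which the `floatx80' significand occupies 64 bits, of which single
precision keeps the top 24 and double precision the top 53.  The function
`cleared_bits_mask' is hypothetical and is not part of SoftFloat.  Because
the behavior for other settings is unspecified, the sketch arbitrarily
treats any value other than 32 or 64 as full precision.

```c
#include <stdint.h>

/* Return a mask of the low significand bits that are zero after an
   operation is rounded at the given precision.  Of the 64 significand
   bits, 24 are kept at single precision and 53 at double precision, so
   40 and 11 low bits, respectively, are cleared. */
uint64_t cleared_bits_mask( int rounding_precision )
{
    switch ( rounding_precision ) {
     case 32: return ( (uint64_t) 1 << 40 ) - 1;
     case 64: return ( (uint64_t) 1 << 11 ) - 1;
     default: return 0;   /* treated as 80: nothing is cleared */
    }
}
```

A result significand `sig' produced at reduced precision would then satisfy
`( sig & cleared_bits_mask( precision ) ) == 0'.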


----------------------------------------------------------------------------
Exceptions and Exception Flags

All five exception flags required by the IEC/IEEE Standard are
implemented.  Each flag is stored as a unique bit in the global variable
`float_exception_flags'.  The positions of the exception flag bits within
this variable are determined by the bit masks `float_flag_inexact',
`float_flag_underflow', `float_flag_overflow', `float_flag_divbyzero', and
`float_flag_invalid'.  The exception flags variable is initialized to all 0,
meaning no exceptions.

An individual exception flag can be cleared with the statement

    float_exception_flags &= ~ float_flag_<exception>;

where `<exception>' is the appropriate name.  To raise a floating-point
exception, the SoftFloat function `float_raise' should be used (see below).
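
The flag handling described above might be wrapped as in the following
sketch.  So that the example compiles on its own, it declares local
stand-ins for `float_exception_flags' and the five bit masks.  The mask
values and the `int' type are placeholders, not the definitions in
`softfloat.h', and the helpers `flags_test' and `flags_clear' are
hypothetical.

```c
/* Local stand-ins for the declarations in `softfloat.h'.  The mask values
   and the variable's type are placeholders for this sketch; real code
   should include the header and use whatever it defines. */
enum {
    float_flag_inexact   = 1,
    float_flag_underflow = 2,
    float_flag_overflow  = 4,
    float_flag_divbyzero = 8,
    float_flag_invalid   = 16
};
int float_exception_flags = 0;

/* Nonzero if any of the flags in `mask' is currently set. */
int flags_test( int mask )
{
    return ( float_exception_flags & mask ) != 0;
}

/* Clear the flags in `mask', leaving all other flags untouched. */
void flags_clear( int mask )
{
    float_exception_flags &= ~ mask;
}
```

One plausible pattern is to clear the flags of interest before a sequence of
operations and to test them afterward, for instance with
`flags_test( float_flag_overflow | float_flag_invalid )'.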

In the terminology of the IEC/IEEE Standard, SoftFloat can detect tininess
for underflow either before or after rounding.  The choice is made by
the global variable `float_detect_tininess', which can be set to either
`float_tininess_before_rounding' or `float_tininess_after_rounding'.
Detecting tininess after rounding is better because it results in fewer