documents, such as the C++ standard, the word `range` is ['sometimes] used as synonym
for `numeric set`, that is, as the ordered sequence of numeric values from `l` to `h`.
In this document, however, a range is an abstract interval which subtends the
numeric set.

For example, the sequence `[-DBL_MAX,DBL_MAX]` is the numeric set of the type
`double`, and the real interval `[abt(-DBL_MAX),abt(DBL_MAX)]` is its range.

Notice, for instance, that the range of a floating-point type is ['continuous]
unlike its numeric set.

This definition was chosen because:

* [*(a)] The discrete set of numeric values is already given by the numeric set.
* [*(b)] Abstract intervals are easier to compare and overlap since only boundary
values need to be considered.

This definition allows for a concise definition of `subranged` as given in the last section.

The width of a numeric set, as defined, is exactly equivalent to the width of a range.

__SPACE__

The [*precision] of a type is given by the width or density of the numeric set.

For integer types, which have density 1, the precision is conceptually equivalent
to the range and is determined by the number of bits used in the value representation:
The higher the number of bits the bigger the size of the numeric set, the wider the
range, and the higher the precision.

For floating types, which have density <<1, the precision is given not by the width
of the range but by the density. In a typical implementation, the range is determined
by the number of bits used in the exponent, and the precision by the number of bits
used in the mantissa (giving the maximum number of significant digits that can be
exactly represented).
The higher the number of exponent bits the wider the range,
while the higher the number of mantissa bits, the higher the precision.

[endsect]

[#numeric_conversion_definitions_roundoff]
[section Exact, Correctly Rounded and Out-Of-Range Representations]

Given an abstract value `V` and a type `T` with its corresponding range `[abt(l),abt(h)]`:

If `V < abt(l)` or `V > abt(h)`, `V` is [*not representable] (cannot be represented) in
the type `T`, or, equivalently, its representation in the type `T` is [*out of range],
or [*overflows].

* If `V < abt(l)`, the [*overflow is negative].
* If `V > abt(h)`, the [*overflow is positive].

If `V >= abt(l)` and `V <= abt(h)`, `V` is [*representable] (can be represented) in the
type `T`, or, equivalently, its representation in the type `T` is [*in range], or
[*does not overflow].

Notice that a numeric type, such as a C++ unsigned type, can define that any `V` does
not overflow by always representing not `V` itself but the abstract value
`U = [ V % (abt(h)+1) ]`, which is always in range.

Given an abstract value `V` represented in the type `T` as `v`, the [*roundoff] error
of the representation is the abstract difference: `(abt(v)-V)`.

Notice that a representation is an ['operation], hence, the roundoff error corresponds
to the representation operation and not to the numeric value itself
(i.e. numeric values do not have any error themselves).

* If the roundoff is 0, the representation is [*exact], and `V` is exactly representable
in the type `T`.
* If the roundoff is not 0, the representation is [*inexact], and `V` is inexactly
representable in the type `T`.

If a representation `v` in a type `T` (either exact or inexact) is any of the adjacents
of `V` in that type, that is, if `v==prev` or `v==next`, the representation is
faithfully rounded.
If the choice between `prev` and `next` matches a given
[*rounding direction], it is [*correctly rounded].

All exact representations are correctly rounded, but not all inexact representations are.
In particular, C++ requires numeric conversions (described below) and the result of
arithmetic operations (not covered by this document) to be correctly rounded, but
batch operations propagate roundoff, thus final results are usually incorrectly
rounded, that is, the numeric value `r` which is the computed result is neither of
the adjacents of the abstract value `R` which is the theoretical result.

Because a correctly rounded representation is always one of the adjacents of the abstract
value being represented, the roundoff is guaranteed to be at most 1ulp.

The following examples summarize the given definitions. Consider:

* A numeric type `Int` representing integer numbers with a
['numeric set]: `{-2,-1,0,1,2}` and
['range]: `[-2,2]`
* A numeric type `Cardinal` representing integer numbers with a
['numeric set]: `{0,1,2,3,4,5,6,7,8,9}` and
['range]: `[0,9]` (no modulo-arithmetic here)
* A numeric type `Real` representing real numbers with a
['numeric set]: `{-2.0,-1.5,-1.0,-0.5,-0.0,+0.0,+0.5,+1.0,+1.5,+2.0}` and
['range]: `[-2.0,+2.0]`
* A numeric type `Whole` representing real numbers with a
['numeric set]: `{-2.0,-1.0,0.0,+1.0,+2.0}` and
['range]: `[-2.0,+2.0]`

First, notice that the types `Real` and `Whole` both represent real numbers,
have the same range, but different precision.

* The integer number `1` (an abstract value) can be exactly represented
in any of these types.
* The integer number `-1` can be exactly represented in `Int`, `Real` and `Whole`,
but cannot be represented in `Cardinal`, yielding negative overflow.
* The real number `1.5` can be exactly represented in `Real`, and inexactly
represented in the other types.
* If `1.5` is represented as either `1` or `2` in any of the types (except `Real`),
the representation is correctly rounded.
* If `0.5` is represented as `+1.5` in the type
`Real`, it is incorrectly rounded.
* `(-2.0,-1.5)` are the `Real` adjacents of any real number in the interval
`[-2.0,-1.5]`, yet there are no `Real` adjacents for `x < -2.0`, nor for `x > +2.0`.

[endsect]

[section Standard (numeric) Conversions]

The C++ language defines [_Standard Conversions] (§4) some of which are conversions
between arithmetic types.

These are [_Integral promotions] (§4.5), [_Integral conversions] (§4.7),
[_Floating point promotions] (§4.6), [_Floating point conversions] (§4.8) and
[_Floating-integral conversions] (§4.9).

In the sequel, integral and floating point promotions are called [*arithmetic promotions],
and these plus integral, floating-point and floating-integral conversions are called
[*arithmetic conversions] (i.e., promotions are conversions).

Promotions, both Integral and Floating point, are ['value-preserving], which means that
the typed value is not changed with the conversion.

In the sequel, consider a source typed value `s` of type `S`, the source abstract
value `N=abt(s)`, a destination type `T`; and whenever possible, a result typed value
`t` of type `T`.

Integer to integer conversions are always defined:

* If `T` is unsigned, the abstract value which is effectively represented is not
`N` but `M=[ N % ( abt(h) + 1 ) ]`, where `h` is the highest unsigned typed
value of type `T`.
* If `T` is signed and `N` is not directly representable, the result `t` is
[_implementation-defined], which means that the C++ implementation is required to
produce a value `t` even if it is totally unrelated to `s`.

Floating to Floating conversions are defined only if `N` is representable;
if it is not, the conversion has [_undefined behavior].

* If `N` is exactly representable, `t` is required to be the exact representation.
* If `N` is inexactly representable, `t` is required to be one of the two
adjacents, with an implementation-defined choice of rounding direction;
that is, the conversion is required to be correctly rounded.

Floating to Integer conversions represent not `N` but
`M=trunc(N)`, where
`trunc()` is to truncate: i.e. to remove the fractional part, if any.

* If `M` is not representable in `T`, the conversion has [_undefined behavior]
(unless `T` is `bool`, see §4.12).

Integer to Floating conversions are always defined.

* If `N` is exactly representable, `t` is required to be the exact representation.
* If `N` is inexactly representable, `t` is required to be one of the
two adjacents, with an implementation-defined choice of rounding direction;
that is, the conversion is required to be correctly rounded.

[endsect]

[#numeric_conversion_definitions_subranged]
[section Subranged Conversion Direction, Subtype and Supertype]

Given a source type `S` and a destination type `T`, there is a
[*conversion direction] denoted: `S->T`.

For any two ranges the following ['range relation] can be defined:
A range `X` can be ['entirely contained] in a range `Y`, in which case
it is said that `X` is enclosed by `Y`.

[: [*Formally:] `R(S)` is enclosed by `R(T)` iff `(R(S) intersection R(T)) == R(S)`.]

If the source type range, `R(S)`, is not enclosed in the target type range,
`R(T)`; that is, if `(R(S) & R(T)) != R(S)`, the conversion direction is said
to be [*subranged], which means that `R(S)` is not entirely contained in `R(T)`
and therefore there is some portion of the source range which falls outside
the target range.
In other words, if a conversion direction `S->T` is subranged,
there are values in `S` which cannot be represented in `T` because they are
out of range.
Notice that for `S->T`, the adjective subranged applies to `T`.

Examples:

Given the following numeric types all representing real numbers:

* `X` with numeric set `{-2.0,-1.0,0.0,+1.0,+2.0}` and range `[-2.0,+2.0]`
* `Y` with numeric set `{-2.0,-1.5,-1.0,-0.5,0.0,+0.5,+1.0,+1.5,+2.0}` and range `[-2.0,+2.0]`
* `Z` with numeric set `{-1.0,0.0,+1.0}` and range `[-1.0,+1.0]`

For:

[variablelist
[[(a) X->Y:][`R(X) & R(Y) == R(X)`, then `X->Y` is not subranged.
Thus, all values of type `X` are representable in the type `Y`.]]
[[(b) Y->X:][`R(Y) & R(X) == R(Y)`, then `Y->X` is not subranged.
Thus, all values of type `Y` are representable in the type `X`, but in this case,
some values are ['inexactly] representable (all the halves).
(note: it is to permit this case that a range is an interval of abstract values and
not an interval of typed values)]]
[[(c) X->Z:][`R(X) & R(Z) != R(X)`, then `X->Z` is subranged.
Thus, some values of type `X` are not representable in the type `Z`, they fall
out of range `(-2.0 and +2.0)`.]]
]

It is possible that `R(S)` is not enclosed by `R(T)`, while neither is `R(T)` enclosed
by `R(S)`; for example, `UNSIG=[0,255]` is not enclosed by `SIG=[-128,127]`;
neither is `SIG` enclosed by `UNSIG`.
This implies that it is possible that a conversion direction is subranged both ways.
This occurs when a mixture of signed/unsigned types is involved and indicates that
in both directions there are values which can fall out of range.

Given the range relation (subranged or not) of a conversion direction `S->T`, it
is possible to classify `S` and `T` as [*supertype] and [*subtype]:
If the conversion is subranged, which means that `T` cannot represent all possible
values of type `S`, `S` is the supertype and `T` the subtype; otherwise, `T` is the
supertype and `S` the subtype.

For example:

[: `R(float)=[-FLT_MAX,FLT_MAX]` and
`R(double)=[-DBL_MAX,DBL_MAX]` ]

If `FLT_MAX < DBL_MAX`:

* `double->float` is subranged and `supertype=double`, `subtype=float`.
* `float->double` is not subranged and `supertype=double`, `subtype=float`.

Notice that while `double->float` is subranged, `float->double` is not,
which yields the same supertype and subtype for both directions.

Now consider:

[: `R(int)=[INT_MIN,INT_MAX]` and `R(unsigned int)=[0,UINT_MAX]` ]

A C++ implementation is required to have `UINT_MAX > INT_MAX` (§3.9/3), so:

* `int->unsigned` is subranged (negative values fall out of range)
and `supertype=int`, `subtype=unsigned`.
* `unsigned->int` is ['also] subranged (high positive values fall out of range)
and `supertype=unsigned`, `subtype=int`.

In this case, the conversion is subranged in both directions and the
supertype, subtype pairs are not invariant (under inversion of direction).
This indicates that none of the types can represent all the values of the other.

When the supertype is the same for both `S->T` and `T->S`, it is effectively
indicating a type which can represent all the values of the subtype.

Consequently, if a conversion `X->Y` is not subranged, but the opposite `(Y->X)` is,
so that the supertype is always `Y`, it is said that the direction `X->Y` is [*correctly
rounded value preserving], meaning that all such conversions are guaranteed to
produce results in range and correctly rounded (even if inexact).

For example, all integer to floating conversions are correctly rounded value preserving.

[endsect]

[endsect]
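
The subranged relation and the standard conversions described above can be
spot-checked with a short C++ sketch. The names `is_subranged`, `to_uchar_modulo`,
`to_int_truncated` and `check_conversion_examples` are hypothetical helpers
introduced only for this illustration; they are not part of any library.
The sketch assumes that `long double` represents the boundary values of the
compared types exactly (true for the small types used here on common platforms)
and that `FLT_MAX < DBL_MAX`.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper (illustration only): true when the range of S is not
// enclosed by the range of T, that is, when the direction S->T is subranged.
// Assumption: long double represents the boundaries of S and T exactly.
template <class S, class T>
constexpr bool is_subranged()
{
    return static_cast<long double>(std::numeric_limits<S>::lowest())
               < static_cast<long double>(std::numeric_limits<T>::lowest())
        || static_cast<long double>(std::numeric_limits<S>::max())
               > static_cast<long double>(std::numeric_limits<T>::max());
}

// Integer to unsigned conversion: always defined, reduces modulo (max + 1).
constexpr unsigned char to_uchar_modulo(int n)
{
    return static_cast<unsigned char>(n);
}

// Floating to integer conversion: discards the fractional part.
// Only call this with values whose truncation fits in int;
// otherwise the conversion has undefined behavior.
constexpr int to_int_truncated(double n)
{
    return static_cast<int>(n);
}

// Runtime spot checks of the statements made in the text.
inline void check_conversion_examples()
{
    // double->float is subranged, float->double is not
    // (assuming FLT_MAX < DBL_MAX).
    assert((is_subranged<double, float>()));
    assert((!is_subranged<float, double>()));

    // int and unsigned int are subranged in both directions.
    assert((is_subranged<int, unsigned int>()));
    assert((is_subranged<unsigned int, int>()));

    // The range of double encloses the range of int.
    assert((!is_subranged<int, double>()));

    // -1 converted to an unsigned type yields the highest value of that type.
    assert(to_uchar_modulo(-1) == std::numeric_limits<unsigned char>::max());

    // Truncation removes the fractional part, toward zero.
    assert(to_int_truncated(1.9) == 1);
    assert(to_int_truncated(-1.9) == -1);
}
```

Calling `check_conversion_examples()` from `main()` runs the checks. Comparing
the boundaries as `long double` is a simplification that keeps the sketch short;
for wide integer types on platforms where `long double` has a small mantissa,
the comparison would itself be inexact.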