floating point number with the same value.  <DFN>Normalization</DFN> consists of doing this repeatedly until the number is normalized.  Two distinct normalized floating point numbers cannot be equal in value.<P>
(There is an exception to this rule: if the mantissa is zero, it is considered normalized.  Another exception happens on certain machines where the exponent is as small as the representation can hold.  Then it is impossible to subtract <CODE>1</CODE> from the exponent, so a number may be normalized even if its fraction is less than <CODE>1/<VAR>b</VAR></CODE>.)<P>
<H4><A NAME="SEC489" HREF="library_toc.html#SEC489" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC489">Floating Point Parameters</A></H4>
<A NAME="IDX1974"></A><P>
These macro definitions can be accessed by including the header file <TT>`float.h'</TT> in your program.<P>
Macro names starting with <SAMP>`FLT_'</SAMP> refer to the <CODE>float</CODE> type, while names beginning with <SAMP>`DBL_'</SAMP> refer to the <CODE>double</CODE> type and names beginning with <SAMP>`LDBL_'</SAMP> refer to the <CODE>long double</CODE> type.  (Currently GCC does not support <CODE>long double</CODE> as a distinct data type, so the values for the <SAMP>`LDBL_'</SAMP> constants are equal to the corresponding constants for the <CODE>double</CODE> type.)<P>
Of these macros, only <CODE>FLT_RADIX</CODE> is guaranteed to be a constant expression.  The other macros listed here cannot be reliably used in places that require constant expressions, such as <SAMP>`#if'</SAMP> preprocessing directives or in the dimensions of static arrays.<P>
Although the ANSI C standard specifies minimum and maximum values for most of these parameters, the GNU C implementation uses whatever values describe the floating point representation of the target machine.  So in principle GNU C actually satisfies the ANSI C requirements only if the target machine is suitable.  
In practice, all the machines currently supported are suitable.<P>
<DL COMPACT>
<DT><CODE>FLT_ROUNDS</CODE>
<DD>This value characterizes the rounding mode for floating point addition.  The following values indicate standard rounding modes:<P>
<DL COMPACT>
<DT><CODE>-1</CODE>
<DD>The mode is indeterminable.
<DT><CODE>0</CODE>
<DD>Rounding is towards zero.
<DT><CODE>1</CODE>
<DD>Rounding is to the nearest number.
<DT><CODE>2</CODE>
<DD>Rounding is towards positive infinity.
<DT><CODE>3</CODE>
<DD>Rounding is towards negative infinity.
</DL><P>
Any other value represents a machine-dependent nonstandard rounding mode.<P>
On most machines, the value is <CODE>1</CODE>, in accordance with the IEEE standard for floating point.<P>
Here is a table showing how certain values round for each possible value of <CODE>FLT_ROUNDS</CODE>, if the other aspects of the representation match the IEEE single-precision standard.<P>
<PRE>
                 0       1              2              3
 1.00000003     1.0     1.0            1.00000012     1.0
 1.00000007     1.0     1.00000012     1.00000012     1.0
-1.00000003    -1.0    -1.0           -1.0           -1.00000012
-1.00000007    -1.0    -1.00000012    -1.0           -1.00000012
</PRE><P>
<DT><CODE>FLT_RADIX</CODE>
<DD>This is the value of the base, or radix, of exponent representation.  This is guaranteed to be a constant expression, unlike the other macros described in this section.  The value is 2 on all machines we know of except the IBM 360 and derivatives.<P>
<DT><CODE>FLT_MANT_DIG</CODE>
<DD>This is the number of base-<CODE>FLT_RADIX</CODE> digits in the floating point mantissa for the <CODE>float</CODE> data type.  The following expression yields <CODE>1.0</CODE> (even though mathematically it should not) due to the limited number of mantissa digits:<P>
<PRE>
float radix = FLT_RADIX;
1.0f + 1.0f / radix / radix / ... 
/ radix</PRE><P>
where <CODE>radix</CODE> appears <CODE>FLT_MANT_DIG</CODE> times.<P>
<DT><CODE>DBL_MANT_DIG</CODE>
<DT><CODE>LDBL_MANT_DIG</CODE>
<DD>This is the number of base-<CODE>FLT_RADIX</CODE> digits in the floating point mantissa for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.<P>
<DT><CODE>FLT_DIG</CODE>
<DD>This is the number of decimal digits of precision for the <CODE>float</CODE> data type.  Technically, if <VAR>p</VAR> and <VAR>b</VAR> are the precision and base (respectively) for the representation, then the decimal precision <VAR>q</VAR> is the maximum number of decimal digits such that any floating point number with <VAR>q</VAR> base 10 digits can be rounded to a floating point number with <VAR>p</VAR> base <VAR>b</VAR> digits and back again, without change to the <VAR>q</VAR> decimal digits.<P>
The value of this macro is supposed to be at least <CODE>6</CODE>, to satisfy ANSI C.<P>
<DT><CODE>DBL_DIG</CODE>
<DT><CODE>LDBL_DIG</CODE>
<DD>These are similar to <CODE>FLT_DIG</CODE>, but for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.  The values of these macros are supposed to be at least <CODE>10</CODE>.<P>
<DT><CODE>FLT_MIN_EXP</CODE>
<DD>This is the smallest possible exponent value for type <CODE>float</CODE>.  More precisely, this is the minimum negative integer such that <CODE>FLT_RADIX</CODE> raised to one less than this power can be represented as a normalized floating point number of type <CODE>float</CODE>.<P>
<DT><CODE>DBL_MIN_EXP</CODE>
<DT><CODE>LDBL_MIN_EXP</CODE>
<DD>These are similar to <CODE>FLT_MIN_EXP</CODE>, but for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.<P>
<DT><CODE>FLT_MIN_10_EXP</CODE>
<DD>This is the minimum negative integer such that <CODE>10</CODE> raised to this power can be represented as a normalized floating point number of type <CODE>float</CODE>.  
This is supposed to be <CODE>-37</CODE> or even less.<P>
<DT><CODE>DBL_MIN_10_EXP</CODE>
<DT><CODE>LDBL_MIN_10_EXP</CODE>
<DD>These are similar to <CODE>FLT_MIN_10_EXP</CODE>, but for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.<P>
<DT><CODE>FLT_MAX_EXP</CODE>
<DD>This is the largest possible exponent value for type <CODE>float</CODE>.  More precisely, this is the maximum positive integer such that <CODE>FLT_RADIX</CODE> raised to one less than this power can be represented as a floating point number of type <CODE>float</CODE>.<P>
<DT><CODE>DBL_MAX_EXP</CODE>
<DT><CODE>LDBL_MAX_EXP</CODE>
<DD>These are similar to <CODE>FLT_MAX_EXP</CODE>, but for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.<P>
<DT><CODE>FLT_MAX_10_EXP</CODE>
<DD>This is the maximum positive integer such that <CODE>10</CODE> raised to this power can be represented as a normalized floating point number of type <CODE>float</CODE>.  This is supposed to be at least <CODE>37</CODE>.<P>
<DT><CODE>DBL_MAX_10_EXP</CODE>
<DT><CODE>LDBL_MAX_10_EXP</CODE>
<DD>These are similar to <CODE>FLT_MAX_10_EXP</CODE>, but for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.<P>
<DT><CODE>FLT_MAX</CODE>
<DD>The value of this macro is the maximum number representable in type <CODE>float</CODE>.  It is supposed to be at least <CODE>1E+37</CODE>.  The value has type <CODE>float</CODE>.<P>
The smallest (most negative) representable number is <CODE>-FLT_MAX</CODE>.<P>
<DT><CODE>DBL_MAX</CODE>
<DT><CODE>LDBL_MAX</CODE>
<DD>These are similar to <CODE>FLT_MAX</CODE>, but for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.  The type of the macro's value is the same as the type it describes.<P>
<DT><CODE>FLT_MIN</CODE>
<DD>The value of this macro is the minimum normalized positive floating point number that is representable in type <CODE>float</CODE>.  It is supposed to be no more than <CODE>1E-37</CODE>.<P>
<DT><CODE>DBL_MIN</CODE>
<DT><CODE>LDBL_MIN</CODE>
<DD>These are similar to <CODE>FLT_MIN</CODE>, but for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.  
The type of the macro's value is the same as the type it describes.<P>
<DT><CODE>FLT_EPSILON</CODE>
<DD>This is the minimum positive floating point number of type <CODE>float</CODE> such that <CODE>1.0 + FLT_EPSILON != 1.0</CODE> is true.  It's supposed to be no greater than <CODE>1E-5</CODE>.<P>
<DT><CODE>DBL_EPSILON</CODE>
<DT><CODE>LDBL_EPSILON</CODE>
<DD>These are similar to <CODE>FLT_EPSILON</CODE>, but for the data types <CODE>double</CODE> and <CODE>long double</CODE>, respectively.  The type of the macro's value is the same as the type it describes.  The values are not supposed to be greater than <CODE>1E-9</CODE>.
</DL><P>
<A NAME="IDX1975"></A><A NAME="IDX1976"></A><H4><A NAME="SEC490" HREF="library_toc.html#SEC490" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC490">IEEE Floating Point</A></H4><P>
Here is an example showing how the floating type measurements come out for the most common floating point representation, specified by the <CITE>IEEE Standard for Binary Floating Point Arithmetic (ANSI/IEEE Std 754-1985)</CITE>.  Nearly all computers designed since the 1980s use this format.<P>
The IEEE single-precision float representation uses a base of 2.  
There is a sign bit, a mantissa with 23 bits plus one hidden bit (so the total precision is 24 base-2 digits), and an 8-bit exponent that can represent values in the range -125 to 128, inclusive.<P>
So, for an implementation that uses this representation for the <CODE>float</CODE> data type, appropriate values for the corresponding parameters are:<P>
<PRE>
FLT_RADIX                             2
FLT_MANT_DIG                         24
FLT_DIG                               6
FLT_MIN_EXP                        -125
FLT_MIN_10_EXP                      -37
FLT_MAX_EXP                         128
FLT_MAX_10_EXP                      +38
FLT_MIN                 1.17549435E-38F
FLT_MAX                 3.40282347E+38F
FLT_EPSILON             1.19209290E-07F
</PRE><P>
Here are the values for the <CODE>double</CODE> data type:<P>
<PRE>
DBL_MANT_DIG                         53
DBL_DIG                              15
DBL_MIN_EXP                       -1021
DBL_MIN_10_EXP                     -307
DBL_MAX_EXP                        1024
DBL_MAX_10_EXP                      308
DBL_MAX         1.7976931348623157E+308
DBL_MIN         2.2250738585072014E-308
DBL_EPSILON     2.2204460492503131E-016
</PRE><P>
<H3><A NAME="SEC491" HREF="library_toc.html#SEC491" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_toc.html#SEC491">Structure Field Offset Measurement</A></H3><P>
You can use <CODE>offsetof</CODE> to measure the location within a structure type of a particular structure member.<P>
<A NAME="IDX1977"></A><U>Macro:</U> size_t <B>offsetof</B> <I>(<VAR>type</VAR>, <VAR>member</VAR>)</I><P>
This expands to an integer constant expression that is the offset of the structure member named <VAR>member</VAR> in the structure type <VAR>type</VAR>.  For example, <CODE>offsetof (struct s, elem)</CODE> is the offset, in bytes, of the member <CODE>elem</CODE> in a <CODE>struct s</CODE>.<P>
This macro won't work if <VAR>member</VAR> is a bit field; you get an error from the C compiler in that case.<P>
<P>Go to 
the <A HREF="library_27.html" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_27.html">previous</A>, <A HREF="library_29.html" tppabs="http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_29.html">next</A> section.<P>