Floating-point arithmetic: Difference between revisions

Internal representation: For the first part of the table, changed "Bits" to "Bits for the encoding". Now, since this is about the encoding, it should be clear that the number given for the significand excludes the implicit bit, when this is used.
Line 241:
While the exponent can be positive or negative, in binary formats it is stored as an unsigned number that has a fixed "bias" added to it. Values of all 0s in this field are reserved for the zeros and [[subnormal numbers]]; values of all 1s are reserved for the infinities and NaNs. The exponent range for normal numbers is [−126, 127] for single precision, [−1022, 1023] for double, or [−16382, 16383] for quad. Normal numbers exclude subnormal values, zeros, infinities, and NaNs.
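
The biased exponent field can be inspected directly. The following is a minimal Python sketch for single precision only, assuming the standard <code>struct</code> module is used to obtain the binary32 encoding; the helper names <code>exponent_field</code> and <code>classify</code> are illustrative and not part of any library.

```python
import struct

BIAS = 127  # exponent bias for IEEE 754 single precision (binary32)


def exponent_field(x: float) -> int:
    """Return the raw 8-bit exponent field of x when encoded as binary32."""
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return (bits >> 23) & 0xFF


def classify(x: float) -> str:
    """Describe x according to its stored exponent field (hypothetical helper)."""
    e = exponent_field(x)
    if e == 0:
        return "zero or subnormal"   # all 0s are reserved
    if e == 0xFF:
        return "infinity or NaN"     # all 1s are reserved
    return f"normal, exponent {e - BIAS}"  # remove the bias


if __name__ == "__main__":
    for value in (1.0, 2.0 ** -126, 2.0 ** 127, 0.0, float("inf")):
        print(value, "->", classify(value))
```

The stored fields 1 and 254 correspond to the smallest and largest normal exponents, −126 and 127, which matches the single-precision range given above.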
 
In the IEEE binary interchange formats, the leading 1 bit of a normalized significand is not stored in the datum, since it is always 1. It is called the "hidden" or "implicit" bit. As a result, the single-precision format has a significand with 24 bits of precision (23 stored), the double-precision format has 53 (52 stored), and quad has 113 (112 stored).
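
As a rough illustration of the implicit bit, the sketch below decodes a normal single-precision value into its 23 stored fraction bits and the full 24-bit significand. It assumes Python's standard <code>struct</code> module for the binary32 encoding; the function name <code>decode_binary32</code> is hypothetical, and zeros, subnormals, infinities and NaNs are deliberately rejected because they carry no implicit leading 1.

```python
import struct


def decode_binary32(x: float):
    """Split a normal binary32 value into (sign, exponent, fraction, significand).

    Hypothetical helper: only normal numbers are handled.
    """
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exp_field = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF          # the 23 bits actually stored
    if exp_field == 0 or exp_field == 0xFF:
        raise ValueError("not a normal number: no implicit leading 1")
    significand = (1 << 23) | fraction  # prepend the implicit bit: 24 bits
    return sign, exp_field - 127, fraction, significand


def rebuild(sign: int, exponent: int, significand: int) -> float:
    """Recompute the value from the 24-bit integer significand."""
    # The significand is an integer scaled by 2**23, hence the extra -23.
    return (-1) ** sign * significand * 2.0 ** (exponent - 23)


if __name__ == "__main__":
    s, e, f, m = decode_binary32(0.15625)   # 0.15625 = 1.01 (binary) x 2**-3
    print(f"stored fraction  = {f:023b}")
    print(f"full significand = {m:024b}")
    print(rebuild(s, e, m))
```

The stored fraction is one bit shorter than the significand used in the arithmetic, which is where the 24-bit precision of single precision comes from.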
 
For example, it was shown above that π, rounded to 24 bits of precision, has: