In computing, octuple precision is a binary floating-point-based computer number format that occupies 32 bytes (256 bits or 64 nibbles) in computer memory. This 256-bit format is intended for applications requiring results in higher than quadruple precision. It is rarely, if ever, used, and very little software or hardware supports it.
IEEE 754 octuple-precision binary floating-point format: binary256
The IEEE 754 standard (2008 revision) specifies binary256, as an interchange format, as having:
- Sign bit: 1 bit
- Exponent width: 19 bits
- Significand precision: 237 bits (236 explicitly stored)
This gives from 71 to 73 significant decimal digits of precision (if a decimal string with at most 71 significant digits is converted to IEEE 754 octuple precision and then converted back to the same number of significant digits, the final string should match the original; and if an IEEE 754 octuple-precision number is converted to a decimal string with at least 73 significant digits and then converted back to octuple precision, the final number must match the original [1]).
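The field widths and digit counts can be checked with a short calculation. The sketch below assumes the interchange-format rule from IEEE 754-2008 (exponent width = round(4·log2(k)) − 13 for a k-bit format) and the usual round-trip digit bounds; the function name `interchange_parameters` and the dictionary keys are illustrative only.

```python
import math

def interchange_parameters(k: int) -> dict:
    """Parameters of a binary{k} interchange format (k >= 128, multiple of 32)."""
    w = round(4 * math.log2(k)) - 13   # exponent field width in bits
    p = k - w                          # precision, counting the implicit leading bit
    return {
        "exponent_bits": w,
        "precision_bits": p,
        "stored_significand_bits": p - 1,
        "bias": 2 ** (w - 1) - 1,
        # decimal digits that survive decimal -> binary -> decimal
        "digits_round_trip": math.floor((p - 1) * math.log10(2)),
        # decimal digits needed so that binary -> decimal -> binary is exact
        "digits_to_preserve": math.ceil(1 + p * math.log10(2)),
    }

params = interchange_parameters(256)
for name, value in params.items():
    print(f"{name}: {value}")
```

Calling the same function with 128 gives the quadruple-precision figures (15 exponent bits, 113 bits of precision, 33 to 36 digits), which is a quick way to tell the two formats apart.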
The format is written with an implicit leading bit with value 1 unless the exponent is stored as all zeros. Thus only 236 bits of the significand appear in the memory format, but the total precision is 237 bits (approximately 71 decimal digits, since log10(2^237) ≈ 71.344). The bits are laid out, from most to least significant, as the sign bit, the 19 exponent bits, and the 236 significand bits.
Exponent encoding
The octuple-precision binary floating-point exponent is encoded using an offset binary representation, with the zero offset being 262143; this offset is also known as the exponent bias in the IEEE 754 standard.
- Emin = −262142
- Emax = 262143
- Exponent bias = 3FFFF₁₆ = 262143
Thus, as defined by the offset binary representation, the true exponent is obtained by subtracting the offset of 262143 from the stored exponent.
The stored exponents 00000₁₆ and 7FFFF₁₆ are interpreted specially.
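Exponent | Significand zero | Significand non-zero | Equation |
---|---|---|---|
00000₁₆ | 0, −0 | subnormal numbers | (−1)^signbit × 2^−262142 × 0.significandbits₂ |
00001₁₆, ..., 7FFFE₁₆ | normalized value | normalized value | (−1)^signbit × 2^(exponent − 262143) × 1.significandbits₂ |
7FFFF₁₆ | ±∞ | NaN (quiet, signalling) | |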
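The interpretation rules in the table can be sketched as a small decoder. This is an illustration under the assumption of a 1/19/236-bit layout with a bias of 262143, not a reference implementation; `decode_binary256` is a hypothetical name, and the bit pattern is passed as a Python integer.

```python
import math
from fractions import Fraction

EXP_BITS = 19
FRAC_BITS = 236
BIAS = (1 << (EXP_BITS - 1)) - 1        # 262143
EXP_ALL_ONES = (1 << EXP_BITS) - 1      # 0x7FFFF

def decode_binary256(bits: int):
    """Return the value of a 256-bit pattern.

    Finite values come back as exact Fractions (the sign of -0 is lost);
    infinities and NaN come back as Python floats.
    """
    sign = -1 if (bits >> 255) & 1 else 1
    exponent = (bits >> FRAC_BITS) & EXP_ALL_ONES
    fraction = bits & ((1 << FRAC_BITS) - 1)

    if exponent == EXP_ALL_ONES:        # all ones: infinity or NaN
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:                   # all zeros: zero or subnormal, no implicit 1
        return sign * Fraction(fraction, 1 << FRAC_BITS) * Fraction(2) ** (1 - BIAS)
    # normal number: implicit leading 1, stored exponent minus the bias
    return sign * (1 + Fraction(fraction, 1 << FRAC_BITS)) * Fraction(2) ** (exponent - BIAS)

one = decode_binary256(0x3FFFF << FRAC_BITS)                    # stored exponent equals the bias
minus_two = decode_binary256((1 << 255) | (0x40000 << FRAC_BITS))
print(one, minus_two)
```

Exact rational arithmetic is used here so that the decoded values are not limited by the precision of the host's native floating-point type.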
The minimum strictly positive (subnormal) value is 2^−262378 ≈ 10^−78984 and has a precision of only one bit. The minimum positive normal value is 2^−262142 ≈ 2.4824 × 10^−78913 and has a precision of 237 bits, i.e. ±2^−262378 as well. The maximum representable value is 2^262144 − 2^261907 ≈ 1.6113 × 10^78913.
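These extreme values follow directly from the exponent range and the precision. The sketch below derives them for a format with a 19-bit exponent and 237 bits of precision and estimates their decimal magnitude through logarithms, which avoids printing integers with tens of thousands of digits. The helper `scientific` is a hypothetical name, and its mantissa is only as accurate as double-precision `math.log10` allows.

```python
import math
from fractions import Fraction

EXP_BITS, PRECISION = 19, 237
BIAS = 2 ** (EXP_BITS - 1) - 1           # 262143
EMIN, EMAX = 1 - BIAS, BIAS              # -262142 and 262143

min_subnormal = Fraction(1, 2 ** (PRECISION - 1 - EMIN))                # 2^(EMIN - 236)
min_normal = Fraction(1, 2 ** (-EMIN))                                  # 2^EMIN
max_finite = Fraction(2 ** (EMAX + 1) - 2 ** (EMAX + 1 - PRECISION))    # (2 - 2^-236) * 2^EMAX

def scientific(x: Fraction):
    """Approximate a positive Fraction as (mantissa, decimal exponent)."""
    log = math.log10(x.numerator) - math.log10(x.denominator)
    exponent = math.floor(log)
    return 10 ** (log - exponent), exponent

for name, value in [("min subnormal", min_subnormal),
                    ("min normal", min_normal),
                    ("max finite", max_finite)]:
    mantissa, exponent = scientific(value)
    print(f"{name}: about {mantissa:.4f} x 10^{exponent}")
```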
Octuple-precision examples
These examples are given in bit representation, in hexadecimal, of the floating-point value. This includes the sign, (biased) exponent, and significand.
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 = 0
8000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 = −0
7fff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 = infinity
ffff f000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 = −infinity
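Hexadecimal patterns of this kind can be generated from the three fields. The following sketch assumes the sign bit sits at bit 255, followed by a 19-bit exponent and a 236-bit fraction; `pack` and `hex_words` are illustrative helper names.

```python
EXP_BITS, FRAC_BITS = 19, 236
EXP_ALL_ONES = (1 << EXP_BITS) - 1
BIAS = (1 << (EXP_BITS - 1)) - 1

def pack(sign: int, exponent: int, fraction: int) -> int:
    """Assemble a 256-bit pattern from sign, biased exponent and fraction fields."""
    return (sign << 255) | (exponent << FRAC_BITS) | fraction

def hex_words(bits: int) -> str:
    """Format a 256-bit pattern as sixteen groups of four hexadecimal digits."""
    digits = f"{bits:064x}"
    return " ".join(digits[i:i + 4] for i in range(0, 64, 4))

examples = [
    ("0", pack(0, 0, 0)),
    ("-0", pack(1, 0, 0)),
    ("infinity", pack(0, EXP_ALL_ONES, 0)),
    ("-infinity", pack(1, EXP_ALL_ONES, 0)),
    ("1", pack(0, BIAS, 0)),
]
for name, bits in examples:
    print(f"{hex_words(bits)} = {name}")
```

Because the sign and exponent together occupy 20 bits, they fill exactly the first five hexadecimal digits, so the exponent spills into the second group of four.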
By default, 1/3 rounds down, as it does in double precision, because the significand has an odd number of bits (237). The bits beyond the rounding point are therefore 0101..., which is less than 1/2 of a unit in the last place.
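The effect of the significand width on the rounding direction of 1/3 can be checked with exact rational arithmetic. In this sketch, `discarded_part_of_one_third` is a hypothetical helper that returns the part of 1/3 lost by truncation to p significant bits, measured in units in the last place; a result below 1/2 means round-to-nearest rounds down.

```python
from fractions import Fraction

def discarded_part_of_one_third(p: int) -> Fraction:
    """Fraction of an ulp discarded when 1/3 is truncated to p significant bits."""
    # 1/3 lies in [2^-2, 2^-1), so with p significant bits one ulp is 2^-(p + 1).
    ulp = Fraction(1, 2 ** (p + 1))
    scaled = Fraction(1, 3) / ulp                       # 1/3 measured in ulps
    return scaled - (scaled.numerator // scaled.denominator)

for name, p in [("single", 24), ("double", 53), ("quadruple", 113), ("octuple", 237)]:
    rest = discarded_part_of_one_third(p)
    direction = "down" if rest < Fraction(1, 2) else "up"
    print(f"{name} (p = {p}): discarded {rest} ulp, rounds {direction}")
```

For odd p the discarded part is exactly 1/3 of an ulp, while for even p (as in single precision) it is 2/3 of an ulp and the result rounds up.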
Implementations
Octuple precision is rarely, if ever, implemented in software, since demand for it is extremely low. One can use general arbitrary-precision arithmetic libraries to obtain octuple (or higher) precision, but specialized octuple-precision implementations may achieve higher performance.
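As a rough illustration of the arbitrary-precision route, the sketch below emulates 237-bit rounding on top of Python's standard `fractions` module. It is an assumption-laden toy rather than any particular library's approach: `round_to_precision` is a hypothetical helper, and it ignores the exponent range, so overflow, subnormals, infinities and NaN are not modelled.

```python
from fractions import Fraction

def round_to_precision(x: Fraction, p: int = 237) -> Fraction:
    """Round x to p significant bits, ties to even (exponent range ignored)."""
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    m = abs(x)
    # Find e such that 2^e <= m < 2^(e + 1).
    e = m.numerator.bit_length() - m.denominator.bit_length()
    if Fraction(2) ** e > m:
        e -= 1
    ulp = Fraction(2) ** (e - (p - 1))   # weight of the last kept bit
    n = round(m / ulp)                   # round() on a Fraction rounds ties to even
    return sign * n * ulp

def multiply(a: Fraction, b: Fraction) -> Fraction:
    """Emulated multiplication: exact product followed by a single rounding."""
    return round_to_precision(a * b)

third = round_to_precision(Fraction(1, 3))
tenth = round_to_precision(Fraction(1, 10))
product = multiply(third, tenth)
print(float(third), float(tenth), float(product))
```

Every operation here pays for exact rational arithmetic before rounding, which is the kind of overhead a dedicated fixed-width implementation could avoid.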
Computer-language support
In C++, it is possible to write a library that handles octuple-precision floating-point arithmetic. In principle, the arithmetic can also be carried out by hand directly on the binary representation, although doing so is extremely tedious.
Hardware support
There is little to no hardware support for octuple precision arithmetic.
See also
- IEEE Standard for Floating-Point Arithmetic (IEEE 754)
- Extended precision (80-bit)
- ISO/IEC 10967, Language Independent Arithmetic
- Primitive data type
- long double
- Single-precision floating-point format
- Half-precision floating-point format
- Double-precision floating-point format
- Quadruple-precision floating-point format
References
- [1] William Kahan (1 October 1987). "Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic" (PDF).
Source for the layout of this page
- The layout of this page is based on that of Quadruple-precision floating-point format.