Quadruple-precision floating-point format
These examples are given in bit ''representation'', in [[hexadecimal]], of the floating-point value. This includes the sign, (biased) exponent, and significand.
 
<pre<includeonly></includeonly>>
0000 0000 0000 0000 0000 0000 0000 0001<sub>16</sub> = 2<sup>−16382</sup> × 2<sup>−112</sup> = 2<sup>−16494</sup>
≈ 6.4751751194380251109244389582276465525 × 10<sup>−4966</sup>
(smallest positive subnormal number)
</pre>
 
<pre<includeonly></includeonly>>
0000 ffff ffff ffff ffff ffff ffff ffff<sub>16</sub> = 2<sup>−16382</sup> × (1 − 2<sup>−112</sup>)
≈ 3.3621031431120935062626778173217519551 × 10<sup>−4932</sup>
(largest subnormal number)
</pre>
 
<pre<includeonly></includeonly>>
0001 0000 0000 0000 0000 0000 0000 0000<sub>16</sub> = 2<sup>−16382</sup>
≈ 3.3621031431120935062626778173217526026 × 10<sup>−4932</sup>
(smallest positive normal number)
</pre>
 
<pre<includeonly></includeonly>>
7ffe ffff ffff ffff ffff ffff ffff ffff<sub>16</sub> = 2<sup>16383</sup> × (2 − 2<sup>−112</sup>)
≈ 1.1897314953572317650857593266280070162 × 10<sup>4932</sup>
(largest normal number)
</pre>
 
<pre<includeonly></includeonly>>
3ffe ffff ffff ffff ffff ffff ffff ffff<sub>16</sub> = 1 − 2<sup>−113</sup>
≈ 0.9999999999999999999999999999999999037
(largest number less than one)
</pre>
 
<pre<includeonly></includeonly>>
3fff 0000 0000 0000 0000 0000 0000 0000<sub>16</sub> = 1 (one)
</pre>
 
<pre<includeonly></includeonly>>
3fff 0000 0000 0000 0000 0000 0000 0001<sub>16</sub> = 1 + 2<sup>−112</sup>
≈ 1.0000000000000000000000000000000001926
(smallest number larger than one)
</pre>
 
<pre<includeonly></includeonly>>
4000 0000 0000 0000 0000 0000 0000 0000<sub>16</sub> = 2
c000 0000 0000 0000 0000 0000 0000 0000<sub>16</sub> = −2
</pre>
 
<pre<includeonly></includeonly>>
0000 0000 0000 0000 0000 0000 0000 0000<sub>16</sub> = 0
8000 0000 0000 0000 0000 0000 0000 0000<sub>16</sub> = −0
</pre>
 
<pre<includeonly></includeonly>>
7fff 0000 0000 0000 0000 0000 0000 0000<sub>16</sub> = infinity
ffff 0000 0000 0000 0000 0000 0000 0000<sub>16</sub> = −infinity
</pre>
 
<pre<includeonly></includeonly>>
4000 921f b544 42d1 8469 898c c517 01b8<sub>16</sub> ≈ 3.1415926535897932384626433832795027975
(closest approximation to π)
</pre>
 
<pre<includeonly></includeonly>>
3ffd 5555 5555 5555 5555 5555 5555 5555<sub>16</sub> ≈ 0.3333333333333333333333333333333333173
(closest approximation to 1/3)
</pre>
 
By default, 1/3 rounds down like [[double precision]], because of the odd number of bits in the significand. The bits beyond the rounding point are <code>0101...</code>, which is less than 1/2 of a [[unit in the last place]].
A similar technique can be used to produce a '''double-quad arithmetic''', which is represented as a sum of two quadruple-precision values. Such values provide at least 226 (or 227) bits of precision.<ref>sourceware.org [http://sourceware.org/ml/libc-alpha/2012-03/msg01024.html Re: The state of glibc libm]</ref>
 
== Implementations ==
Quadruple precision is often implemented in software by a variety of techniques (such as the double-double technique above, although that technique does not implement IEEE quadruple precision), since direct hardware support for quadruple precision is, {{as of|2016|lc=on}}, less common (see "[[#Hardware support|Hardware support]]" below). One can use general [[arbitrary-precision arithmetic]] libraries to obtain quadruple (or higher) precision, but specialized quadruple-precision implementations may achieve higher performance.
 
=== Computer-language support ===
A separate question is the extent to which quadruple-precision types are directly incorporated into computer [[programming language]]s.
 
As of 2024, [[Rust (programming language)|Rust]] is working on adding a new <code>f128</code> type for IEEE quadruple-precision 128-bit floats.<ref>{{cite web |last1=Cross |first1=Travis |title=Tracking Issue for f16 and f128 float types |url=https://github.com/rust-lang/rust/issues/116909 |website=GitHub |access-date=2024-07-05}}</ref>
 
=== Libraries and toolboxes ===
* The [[GNU Compiler Collection|GCC]] quad-precision math library, [https://gcc.gnu.org/onlinedocs/libquadmath libquadmath], provides <code>__float128</code> and <code>__complex128</code> operations.
* The [[Boost (C++ libraries)|Boost]] multiprecision library Boost.Multiprecision provides a unified cross-platform C++ interface for the <code>__float128</code> and <code>_Quad</code> types, and includes a custom implementation of the standard math library.<ref>{{cite web |title=Boost.Multiprecision – float128 |url=http://www.boost.org/doc/libs/1_58_0/libs/multiprecision/doc/html/boost_multiprecision/tut/floats/float128.html |access-date=2015-06-22}}</ref>