Decimal64 floating-point format: Difference between revisions

Revision as of 17:31, 8 December 2024 by 176.4.128.38 (talk). Edit summary: nicer table for BID encoding, WIP, e.g. misleading table before has to be removed as only valid for BID, not DPD, and DPD table should also be improved, pls. do NOT revert for silly nitpicking, edits like these are time consuming work! Tag: Visual edit

Latest revision as of 14:33, 25 August 2025 by Citation bot (talk | contribs). Edit summary: Removed URL that duplicated identifier. Removed access-date with no URL. | Suggested by Headbomb | Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox | #UCB_webform_linked 629/1032
(45 intermediate revisions by 18 users not shown)
Line 1:
{{Short description|64-bit computer number format}}
{{lowercase title}}
{{Use dmy dates|date=July 2020|cs1-dates=y}}
{{floating-point}}
In [[computing]], '''decimal64''' is a [[decimal floating point|decimal floating-point]] [[computer number format]] that occupies 8 bytes (64 bits) in computer memory.

It is intended for applications whose results must stay close to hand-computed decimal arithmetic. Unlike the binary formats, the decimal formats represent decimal fractions exactly and support rounding to nearest with ties away from zero, within the limits of their range and precision.

Decimal64 is a decimal floating-point format, formally introduced in the [[IEEE 754-2008 revision|2008 revision]]<ref name="IEEE-754_2008">{{cite book |title=IEEE Standard for Floating-Point Arithmetic |author=IEEE Computer Society |date=2008-08-29 |publisher=[[IEEE]] |id=IEEE Std 754-2008 |doi=10.1109/IEEESTD.2008.4610935 |ref=CITEREFIEEE_7542008 |isbn=978-0-7381-5753-5}}</ref> of the [[IEEE 754]] standard, also known as ISO/IEC/IEEE 60559:2011.<ref name="ISO-60559_2011">{{Cite book |last=ISO/IEC JTC 1/SC 25 |title=ISO/IEC/IEEE 60559:2011 — Information technology — Microprocessor Systems — Floating-Point arithmetic |url=https://www.iso.org/standard/57469.html |publisher=ISO |pages=1–58 |date=June 2011}}</ref>

== Format ==
Decimal64 supports 'normal' values with 16 digits of precision from {{gaps|±1.000|000|000|000|000|e=-383}} to {{gaps|±9.999|999|999|999|999|e=384}}, plus 'denormal' values whose relative precision decreases gradually down to ±1×10<sup>−398</sup>, [[signed zero]]s, signed infinities and [[NaN]] (Not a Number). Equivalently, with an integer significand, finite values range from {{gaps|±0|000|000|000|000|000|e=-398}} to {{gaps|±9|999|999|999|999|999|e=369}}. This format supports two different encodings.

The binary format of the same size supports a range from denormal-min {{gaps|±5|e=-324}}, over normal-min with full 53-bit precision {{gaps|±2.225|073|858|507|201|4|e=-308}}, to max {{gaps|±1.797|693|134|862|315|7|e=+308}}.

Because the significand for the [[IEEE 754]] decimal formats is not normalized, most values with fewer than 16 [[significant digits]] have multiple possible representations; 1000000 × 10<sup>−2</sup> = 100000 × 10<sup>−1</sup> = 10000 × 10<sup>0</sup> = 1000 × 10<sup>1</sup> all have the value 10000. These sets of representations of the same value are called ''[[Cohort (floating point)|cohorts]]''; the different members can be used to denote how many digits of the value are known precisely. Each signed zero has 768 possible representations (1536 for all zeros, in two different cohorts).
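The cohort idea can be made concrete with a short Python sketch. It is only an illustration: the helper `cohort` and the constants `PRECISION`, `Q_MIN` and `Q_MAX` are hypothetical names, and the limits assume the integer-significand view of decimal64 (at most 16 coefficient digits, exponent from −398 to 369). Python's `decimal` module is used purely to show that cohort members compare equal while keeping distinct representations; it is not a decimal64 implementation.

```python
from decimal import Decimal

# Assumed decimal64 limits in the integer-significand view.
PRECISION = 16            # maximum number of coefficient digits
Q_MIN, Q_MAX = -398, 369  # smallest and largest exponent


def cohort(coefficient: int, exponent: int) -> list[tuple[int, int]]:
    """List every (coefficient, exponent) pair with the same value.

    The coefficient is assumed to be a non-negative integer.
    """
    if coefficient == 0:
        # Zero is representable with every exponent.
        return [(0, q) for q in range(Q_MIN, Q_MAX + 1)]
    # Strip trailing zeros to reach the member with the largest exponent.
    while coefficient % 10 == 0:
        coefficient //= 10
        exponent += 1
    members = []
    # Pad with trailing zeros again until the 16-digit limit is reached.
    while len(str(coefficient)) <= PRECISION:
        if Q_MIN <= exponent <= Q_MAX:
            members.append((coefficient, exponent))
        coefficient *= 10
        exponent -= 1
    return members


# Members of one cohort compare equal but keep different exponents.
a, b = Decimal("1000000E-2"), Decimal("1000E1")
same_value = a == b
different_form = a.as_tuple() != b.as_tuple()
```

Under these assumptions, `cohort(10000, 0)` contains the four representations quoted in the text, and `cohort(0, 0)` enumerates one signed zero's representations.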
== Encoding of decimal64 values ==
Line 29 ⟶ 32:
Both alternatives provide exactly the same set of representable numbers: 16 digits of significand and {{math|size=100%|1=3 × 2<sup>8</sup> = 768}} possible decimal exponent values. (All the possible decimal exponent values storable in a [[binary64]] number are representable in decimal64, and most bits of the significand of a binary64 are stored keeping roughly the same number of decimal digits in the significand.)

In both cases, the most significant 4 bits of the significand (which actually only have 10 possible values) are combined with two bits of the exponent (3 possible values) to use 30 of the 32 possible values of a 5-bit field. The remaining combinations encode [[infinity|infinities]] and [[NaN]]s. BID and DPD use different bits of the combination field for that.

In the cases of Infinity and NaN, all other bits of the encoding are ignored. Thus, it is possible to initialize an array to Infinities or NaNs by filling it with a single byte value.
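The byte-fill remark can be checked with a small Python sketch. The helper names `classify` and `fill` are hypothetical, and the sketch assumes the layout described in this article: one sign bit, then a 5-bit field whose values 11110 and 11111 mark infinities and NaNs, with the following bit distinguishing signaling from quiet NaNs.

```python
def classify(bits: int) -> str:
    """Classify a 64-bit decimal64 pattern from its leading bits only."""
    sign = "-" if bits >> 63 else "+"
    top5 = (bits >> 58) & 0b11111      # the five bits after the sign
    if top5 == 0b11110:
        return sign + "Infinity"
    if top5 == 0b11111:
        # The next bit (sixth bit of the combination field) marks signaling NaNs.
        return "sNaN" if (bits >> 57) & 1 else "qNaN"
    return "finite"


def fill(byte: int) -> int:
    """Build the 64-bit pattern obtained by repeating one byte eight times."""
    return int.from_bytes(bytes([byte]) * 8, "big")


# Candidate fill bytes: 0x78 and 0xF8 for infinities, 0x7C and 0x7E for NaNs.
results = {hex(b): classify(fill(b)) for b in (0x78, 0xF8, 0x7C, 0x7E, 0x00)}
```

Because every byte of the pattern is identical, byte order does not matter here, and the classification is the same for the BID and DPD encodings.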
=== Binary integer significand field ===
This format uses a binary significand from 0 to {{math|size=100%|1=10<sup>16</sup> − 1 = {{gaps|9|999|999|999|999|999}} = 2386F26FC0FFFF<sub>16</sub> = {{gaps|1000|1110000110|1111001001|1011111100|0000111111|1111111111<sub>2</sub>}}.}}

The encoding, completely stored on 64 bits, can represent binary significands up to {{math|size=100%|1=10 × 2<sup>50</sup> − 1 = {{gaps|11|258|999|068|426|239}} = 27FFFFFFFFFFFF<sub>16</sub>,}} but values larger than {{math|size=100%|1=10<sup>16</sup> − 1}} are illegal (and the standard requires implementations to treat them as 0, if encountered on input).

As described above, the encoding varies depending on whether the most significant {{val|4|u=bits}} of the significand are in the range 0 to 7 (0000<sub>2</sub> to 0111<sub>2</sub>), or higher (1000<sub>2</sub> or 1001<sub>2</sub>).
Line 68 ⟶ 48:
|+ BID Encoding
|-
! colspan="13" | Combination Field
! rowspan="2" |
! rowspan="2" | Exponent
! rowspan="2" | Significand / Description
|-
! g12 !! g11 !! g10 !! g9 !! g8 !! g7 !! g6 !! g5 !! g4 !! g3 !! g2
Line 80 ⟶ 57:
!g0
|-
| colspan="16" | combination field not starting with '11', bits ab = 00, 01 or 10
|-
| style="font-family:monospace; background:#cedff2;" | '''a''' || style="font-family:monospace; background:#cedff2;" | '''b''' || style="font-family:monospace; background:#cedff2;" | '''c''' || style="font-family:monospace; background:#cedff2;" | '''d''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cef2e0;" | '''e''' || style="font-family:monospace; background:#cef2e0;" | '''f''' || style="font-family:monospace; background:#cef2e0;" | '''g'''
| || style="font-family:monospace; background:#cedff2;" | '''abcdmmmmmm''' || style="background:#cef2e0;" | {{mono|(0)'''efgtttttttttttttttttttttttttttttttttttttttttttttttttt'''}} Finite number with small first digit of significand (0 … 7).
|-
| colspan="16" | combination field starting with '11', but not 1111, bits ab = 11, bits cd = 00, 01 or 10
|-
| 1 || 1 || style="font-family:monospace; background:#cedff2;" | '''c''' || style="font-family:monospace; background:#cedff2;" | '''d''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''e''' || style="font-family:monospace; background:#cedff2;" | '''f''' || style="font-family:monospace; background:#cef2e0;" | '''g'''
| || style="font-family:monospace; background:#cedff2;" | '''cdmmmmmmef''' || style="background:#cef2e0;" | {{mono|'''100gtttttttttttttttttttttttttttttttttttttttttttttttttt'''}} Finite number with big first digit of significand (8 or 9).
|-
| colspan="16" | combination field starting with '1111', bits abcd = 1111
|-
| 1 || 1 || 1 || 1 || 0 || colspan="8" |
| rowspan="3" | || || ±Infinity
|-
| 1 || 1 || 1 || 1 || 1 ||0
| colspan="7" |
| | quiet NaN
|-
|1
Line 110 ⟶ 85:
|1
|1
| colspan="7" |
| |signaling NaN (with payload in significand)
Line 140 ⟶ 112:
The last {{val|50|u=bits}} are the significand continuation field, consisting of five 10-bit ''[[declet (computing)|declet]]s''.<ref name="Muller_2010">{{cite book |author-last1=Muller |author-first1=Jean-Michel |author-last2=Brisebarre |author-first2=Nicolas |author-last3=de Dinechin |author-first3=Florent |author-last4=Jeannerod |author-first4=Claude-Pierre |author-last5=Lefèvre |author-first5=Vincent |author-last6=Melquiond |author-first6=Guillaume |author-last7=Revol |author-first7=Nathalie |author7-link=Nathalie Revol |author-last8=Stehlé |author-first8=Damien |author-last9=Torres |author-first9=Serge |title=Handbook of Floating-Point Arithmetic |year=2010 |publisher=[[Birkhäuser]] |edition=1 |isbn=978-0-8176-4704-9<!-- print --> |doi=10.1007/978-0-8176-4705-6 |lccn=2009939668<!-- |isbn=978-0-8176-4705-6 (online), ISBN 0-8176-4704-X (print) --> |url=https://cds.cern.ch/record/1315760}}</ref> Each declet encodes three decimal digits<ref name="Muller_2010"/> using the DPD encoding.
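For comparison with the declet-based layout, the BID field layout tabulated earlier can be sketched as a small Python decoder. This is a minimal illustration, not a reference implementation: `decode_bid`, `BIAS` and `MAX_COEFF` are hypothetical names, and the exponent bias of 398 is an assumption taken from the integer-significand exponent range (−398 to 369) rather than stated in the tables.

```python
BIAS = 398                 # assumed decimal64 exponent bias
MAX_COEFF = 10**16 - 1     # largest legal significand


def decode_bid(bits: int):
    """Decode a 64-bit BID pattern.

    Returns (sign, coefficient, exponent) for finite values, or
    (sign, label) for infinities and NaNs.
    """
    sign = bits >> 63
    if (bits >> 59) & 0b1111 == 0b1111:
        # Combination field starts with 1111: infinity or NaN.
        if (bits >> 58) & 1 == 0:
            return (sign, "Infinity")
        return (sign, "sNaN" if (bits >> 57) & 1 else "qNaN")
    if (bits >> 61) & 0b11 != 0b11:
        # Small leading digit: 10 exponent bits, then a 53-bit significand.
        exponent = (bits >> 53) & 0x3FF
        coefficient = bits & ((1 << 53) - 1)
    else:
        # Large leading digit: exponent shifted by two bits, and an
        # implicit '100' prefixed to the remaining 51 significand bits.
        exponent = (bits >> 51) & 0x3FF
        coefficient = (0b100 << 51) | (bits & ((1 << 51) - 1))
    if coefficient > MAX_COEFF:
        coefficient = 0    # illegal significands are treated as zero
    return (sign, coefficient, exponent - BIAS)


# Example patterns built by hand from the field layout.
one = (BIAS << 53) | 1
largest = (0b11 << 61) | (BIAS << 51) | (MAX_COEFF & ((1 << 51) - 1))
```

Here `one` encodes 1 × 10<sup>0</sup> in the first form and `largest` encodes 9999999999999999 × 10<sup>0</sup> in the second form, so both branches of the decoder can be exercised.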
If the first two bits after the sign bit are "00", "01", or "10", then those are the leading bits of the exponent, and the three bits "cde" after that are interpreted as the leading decimal digit (0 to 7).

If the first two bits after the sign bit are "11", then the next two bits are the leading bits of the exponent, and the following bit "e" is prefixed with implicit bits "100" to form the leading decimal digit (8 or 9).

The remaining two combinations (11 110 and 11 111) of the 5-bit field after the sign bit are used to represent ±infinity and NaNs, respectively.

{| class="wikitable" style="text-align:left; border-width:0;"
|+ DPD Encoding
|-
! colspan="13" | Combination Field
! rowspan="2" |
! rowspan="2" | Exponent
! rowspan="2" | Significand / Description
|-
! g12 !! g11 !! g10 !! g9 !! g8 !! g7 !! g6 !! g5 !! g4 !! g3 !! g2
!g1
!g0
|-
| colspan="16" | combination field not starting with '11', bits ab = 00, 01 or 10
|-
| style="font-family:monospace; background:#cedff2;" | '''a''' || style="font-family:monospace; background:#cedff2;" | '''b''' || style="font-family:monospace; background:#cef2e0;" | '''c''' || style="font-family:monospace; background:#cef2e0;" | '''d''' || style="font-family:monospace; background:#cef2e0;" | '''e''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m'''
| || style="font-family:monospace; background:#cedff2;" | '''abmmmmmmmm''' || style="background:#cef2e0;" | {{nowrap|{{mono|(0)'''cde tttttttttt tttttttttt tttttttttt tttttttttt tttttttttt'''}}}} Finite number with small first digit of significand (0 … 7).
|-
| colspan="16" | combination field starting with '11', but not 1111, bits ab = 11, bits cd = 00, 01 or 10
|-
| 1 || 1 || style="font-family:monospace; background:#cedff2;" | '''c''' || style="font-family:monospace; background:#cedff2;" | '''d''' || style="font-family:monospace; background:#cef2e0;" | '''e''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m''' || style="font-family:monospace; background:#cedff2;" | '''m'''
| || style="font-family:monospace; background:#cedff2;" | '''cdmmmmmmmm''' || style="background:#cef2e0;" | {{nowrap|{{mono|'''100e tttttttttt tttttttttt tttttttttt tttttttttt tttttttttt'''}}}} Finite number with big first digit of significand (8 or 9).
|-
| colspan="16" | combination field starting with '1111', bits abcd = 1111
|-
| 1 || 1 || 1 || 1 || 0 || colspan="8" |
| rowspan="3" | || || ±Infinity
|-
| 1 || 1 || 1 || 1 || 1 ||0
| colspan="7" |
| | quiet NaN
|-
|1
|1
|1
|1
|1
|1
| colspan="7" |
| |signaling NaN (with payload in significand)
|}

The DPD/3BCD transcoding for the declets is given by the following table. b9...b0 are the bits of the DPD, and d2...d0 are the three BCD digits.