structured explanation of sign, exponent and significand bits Tags: Reverted Visual edit |
Line 2:
{{Floating-point}}
{{Computer architecture bit widths}}
In [[computing]], '''quadruple precision''' (or '''quad precision''') is a binary [[Floating-point arithmetic|floating-point]]–based [[computer number format]] that occupies 16 bytes (128 bits).
This 128-bit quadruple precision is designed not only for applications requiring results in higher than double precision,<ref>{{cite web |last1=Bailey |first1=David H. |last2=Borwein |first2=Jonathan M. |date=July 6, 2009 |title=High-Precision Computation and Mathematical Physics |url=https://www.davidhbailey.com/dhbpapers/dhb-jmb-acat08.pdf}}</ref> but also, as a primary function, to allow the computation of double precision results more reliably and accurately by minimising overflow and [[round-off error]]s in intermediate calculations and scratch variables. [[William Kahan]], primary architect of the original IEEE 754 floating-point standard, noted, "For now the [[extended precision#x86 Architecture Extended Precision Format|10-byte Extended format]] is a tolerable compromise between the value of extra-precise arithmetic and the price of implementing it to run fast; very soon two more bytes of precision will become tolerable, and ultimately a 16-byte format ... That kind of gradual evolution towards wider precision was already in view when [[IEEE 754|IEEE Standard 754 for Floating-Point Arithmetic]] was framed."<ref>{{cite book |last=Higham |first=Nicholas |title="Designing stable algorithms" in Accuracy and Stability of Numerical Algorithms (2 ed) |publisher=SIAM |year=2002 |pages=43}}</ref>
In [[IEEE 754-2008]] the 128-bit base-2 format is officially referred to as '''binary128'''.
== IEEE 754 quadruple-precision binary floating-point format: binary128 ==
Line 21 ⟶ 15:
* [[Significand]] [[precision (arithmetic)|precision]]: 113 bits (112 explicitly stored)
<!-- "significand", with a d at the end, is a technical term, please do not confuse with "significant" -->
* The sign bit determines the sign of the number (including when the number is zero, which is [[Signed zero|signed]]); "1" denotes negative.
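The field layout described above can be illustrated with a short Python sketch. It is not part of any standard library: the helper names <code>split_binary128</code> and <code>decode_binary128</code> are hypothetical, the 15-bit exponent width is inferred as 128 − 1 − 112, and the exponent bias of 16383 and the subnormal rule are taken from the IEEE 754 binary128 definition rather than from the text above.

```python
from fractions import Fraction

FRACTION_BITS = 112   # explicitly stored significand bits
EXPONENT_BITS = 15    # assumption: 128 total - 1 sign bit - 112 stored bits
BIAS = 16383          # assumption: IEEE 754 binary128 exponent bias, 2**14 - 1


def split_binary128(bits: int):
    """Split a 128-bit integer into (sign, biased exponent, fraction) fields."""
    sign = (bits >> 127) & 1
    exponent = (bits >> FRACTION_BITS) & ((1 << EXPONENT_BITS) - 1)
    fraction = bits & ((1 << FRACTION_BITS) - 1)
    return sign, exponent, fraction


def decode_binary128(bits: int) -> Fraction:
    """Return the exact rational value of a finite binary128 bit pattern.

    A negative zero decodes to Fraction(0), so its sign is not preserved here.
    """
    sign, exponent, fraction = split_binary128(bits)
    if exponent == (1 << EXPONENT_BITS) - 1:
        raise ValueError("infinity or NaN has no rational value")
    stored = Fraction(fraction, 1 << FRACTION_BITS)
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1 bit.
        magnitude = stored * Fraction(2) ** (1 - BIAS)
    else:
        # Normal number: the implicit leading 1 supplies the 113th bit.
        magnitude = (1 + stored) * Fraction(2) ** (exponent - BIAS)
    return -magnitude if sign else magnitude
```

For instance, the bit pattern <code>0x3FFF << 112</code> has a zero sign bit, a biased exponent of 16383 and an all-zero stored significand, so under these assumptions it decodes to exactly 1.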
The 113-bit significand gives a precision of 33 to 36 significant decimal digits. If a decimal string with at most 33 significant digits is converted to the IEEE 754 quadruple-precision format, giving a normal number, and then converted back to a decimal string with the same number of digits, the final result should match the original string. Conversely, if an IEEE 754 quadruple-precision number is converted to a decimal string with at least 36 significant digits and then converted back to quadruple-precision representation, the final result must match the original number.<ref name="whyieee">{{cite web |author=Kahan |first=William |date=1 October 1987 |title=Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic |url=http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF}}</ref>
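The figures 33 and 36 can be reproduced from the significand width with a small calculation. The sketch below assumes the commonly used round-trip bounds for a binary format with ''p'' significand bits, namely floor((''p'' − 1)·log<sub>10</sub> 2) digits for decimal → binary → decimal and ceil(1 + ''p''·log<sub>10</sub> 2) digits for binary → decimal → binary; these formulas and the function name <code>decimal_digit_bounds</code> are not given in the text above.

```python
import math


def decimal_digit_bounds(p: int) -> tuple[int, int]:
    """Return (lower, upper) significant-decimal-digit bounds for p significand bits.

    lower: decimal strings with this many digits survive a trip through the format.
    upper: this many digits are enough to recover any value of the format exactly.
    """
    lower = math.floor((p - 1) * math.log10(2))
    upper = math.ceil(1 + p * math.log10(2))
    return lower, upper


quad_bounds = decimal_digit_bounds(113)    # quadruple precision
double_bounds = decimal_digit_bounds(53)   # double precision, for comparison
```

With ''p'' = 113 this evaluates to 33 and 36; with ''p'' = 53 it gives the familiar 15 and 17 digits of double precision.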
Line 189 ⟶ 167:
A quadruple-precision (128-bit) hardware implementation should not be confused with "128-bit FPUs" that implement [[Single instruction, multiple data|SIMD]] instructions, such as [[Streaming SIMD Extensions]] or [[AltiVec]]. In that context, "128-bit" refers to [[Vector processor|vectors]] of four 32-bit single-precision or two 64-bit double-precision values that are operated on simultaneously.
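The distinction can be illustrated with a rough Python model of a 128-bit SIMD register as 16 raw bytes. This only models the data layout and is not how SSE or AltiVec are programmed; the function name <code>simd_add</code> and the little-endian packing are illustrative assumptions.

```python
import struct

REGISTER_BYTES = 16  # a 128-bit SIMD register

# The same 16 bytes hold either four single-precision or two double-precision lanes.
four_singles = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
two_doubles = struct.pack("<2d", 1.0, 2.0)


def simd_add(a: bytes, b: bytes, fmt: str) -> bytes:
    """Add two 16-byte registers lane by lane, in the manner of a SIMD add."""
    xs = struct.unpack(fmt, a)
    ys = struct.unpack(fmt, b)
    return struct.pack(fmt, *(x + y for x, y in zip(xs, ys)))


summed = struct.unpack(
    "<4f", simd_add(four_singles, struct.pack("<4f", 10.0, 20.0, 30.0, 40.0), "<4f")
)
```

Each lane is still an ordinary 32-bit or 64-bit value, so the register width adds parallelism and not precision; a quadruple-precision value would instead use all 128 bits for a single number.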
[[William Kahan]], primary architect of the original IEEE 754 floating-point standard, noted, "For now the [[extended precision#x86 Architecture Extended Precision Format|10-byte Extended format]] is a tolerable compromise between the value of extra-precise arithmetic and the price of implementing it to run fast; very soon two more bytes of precision will become tolerable, and ultimately a 16-byte format ... That kind of gradual evolution towards wider precision was already in view when [[IEEE 754|IEEE Standard 754 for Floating-Point Arithmetic]] was framed."<ref>{{cite book |last=Higham |first=Nicholas |title="Designing stable algorithms" in Accuracy and Stability of Numerical Algorithms (2 ed) |publisher=SIAM |year=2002 |pages=43}}</ref>
== See also ==
* [[IEEE 754]], IEEE standard for floating-point arithmetic
* [[ISO/IEC 10967]], Language independent arithmetic
* [[Primitive data type]]