Quadruple-precision floating-point format: Difference between revisions

light copyedit mainly focused on citations
Citation bot (talk | contribs)
Add: authors 1-2. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | Suggested by Dominic3203 | Category:Binary arithmetic | #UCB_Category 69/100
Line 4:
In [[computing]], '''quadruple precision''' (or '''quad precision''') is a binary [[Floating-point arithmetic|floating-point]]–based [[computer number format]] that occupies 16 bytes (128 bits) with precision at least twice the 53-bit [[Double-precision floating-point format|double precision]].
 
This 128-bit quadruple precision is designed not only for applications requiring results in higher than double precision,<ref>{{cite web |last1=Bailey |first1=David H. |last2=Borwein |first2=Jonathan M. |date=July 6, 2009 |title=High-Precision Computation and Mathematical Physics |url=https://www.davidhbailey.com/dhbpapers/dhb-jmb-acat08.pdf}}</ref> but also, as a primary function, to allow the computation of double precision results more reliably and accurately by minimising overflow and [[round-off error]]s in intermediate calculations and scratch variables. [[William Kahan]], primary architect of the original IEEE 754 floating-point standard, noted, "For now the [[extended precision#x86 Architecture Extended Precision Format|10-byte Extended format]] is a tolerable compromise between the value of extra-precise arithmetic and the price of implementing it to run fast; very soon two more bytes of precision will become tolerable, and ultimately a 16-byte format ... That kind of gradual evolution towards wider precision was already in view when [[IEEE 754|IEEE Standard 754 for Floating-Point Arithmetic]] was framed."<ref>{{cite book|first=Nicholas | last=Higham |title="Designing stable algorithms" in Accuracy and Stability of Numerical Algorithms (2 ed)| publisher=SIAM|year=2002 | pages=43 }}</ref>
 
In [[IEEE 754-2008]] the 128-bit base-2 format is officially referred to as '''binary128'''.
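
The effect of the wider significand on intermediate round-off can be illustrated with a minimal sketch. It is not an implementation of binary128: it emulates only round-to-nearest-even at a chosen significand width using exact rational arithmetic, ignores exponent range, and handles positive values only. The helper name <code>round_to_precision</code> is hypothetical, and the 113-bit figure is the significand precision of binary128 (53 bits for double precision).

```python
from fractions import Fraction

DOUBLE_BITS = 53   # significand precision of binary64
QUAD_BITS = 113    # significand precision of binary128


def round_to_precision(x: Fraction, p: int) -> Fraction:
    """Round positive x to p significant bits, ties to even.

    Hypothetical helper for illustration; exponent range is ignored.
    """
    if x <= 0:
        raise ValueError("this sketch handles positive values only")
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    # Scale so that the significand becomes an integer of about p bits,
    # round it (Fraction rounds halves to even), then scale back.
    scale = Fraction(2) ** (p - 1 - e)
    return round(x * scale) / scale


tiny = Fraction(1, 2**60)
x = 1 + tiny

# The small term is lost when the intermediate sum is held in 53 bits,
# but survives when it is held in 113 bits.
lost = round_to_precision(x, DOUBLE_BITS) - 1
kept = round_to_precision(x, QUAD_BITS) - 1

print(lost == 0)      # True
print(kept == tiny)   # True
```

In this sketch the intermediate value 1&nbsp;+&nbsp;2<sup>−60</sup> rounds to exactly 1 at 53 bits, so subtracting 1 afterwards yields 0, whereas at 113 bits the small term is preserved.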
Line 132:
 
=== Hardware support ===
IEEE quadruple precision was added to the [[IBM System/390]] G5 in 1998,<ref>{{cite journal |last1=Schwarz |first1=E. M. |last2=Krygowski |first2=C. A. |date=September 1999 |title=The S/390 G5 floating-point unit |journal=IBM Journal of Research and Development |volume=43 |issue=5/6 |pages=707–721 |doi=10.1147/rd.435.0707 |citeseerx=10.1.1.117.6711 }}</ref> and is supported in hardware in subsequent [[z/Architecture]] processors.<ref>{{cite news |last1=Gerwig |first1=G. |last2=Wetter |first2=H. |last3=Schwarz |first3=E. M. |last4=Haess |first4=J. |last5=Krygowski |first5=C. A. |last6=Fleischer |first6=B. M. |last7=Kroener |first7=M. |date=May 2004 |title=The IBM eServer z990 floating-point unit. IBM J. Res. Dev. 48 |pages=311–322}}</ref><ref>{{cite web |author=Schwarz |first=Eric |date=June 22, 2015 |title=The IBM z13 SIMD Accelerators for Integer, String, and Floating-Point |url=http://arith22.gforge.inria.fr/slides/s1-schwarz.pdf |access-date=July 13, 2015}}</ref> The IBM [[POWER9]] CPU ([[Power ISA#Power ISA v.3.0|Power ISA 3.0]]) has native 128-bit hardware support.<ref name=gcc6changes/>
 
Native support of IEEE 128-bit floats is defined in [[PA-RISC]] 1.0,<ref>{{cite web |url=http://grouper.ieee.org/groups//754/email/msg04128.html |title=Implementor support for the binary interchange formats |website=[[IEEE]] |archive-url=https://web.archive.org/web/20171027202715/https://grouper.ieee.org/groups//754/email/msg04128.html |archive-date=2017-10-27 |access-date=2021-07-15}}</ref> and in [[SPARC]] V8<ref>{{cite book