Floating point operations per second: Difference between revisions

|-
|}
[[Floating-point arithmetic]] is needed for very large or very small [[real number]]s, or computations that require a large dynamic range. Floating-point representation is similar to scientific notation, except that computers use [[Binary number|base two]] (with rare exceptions), rather than [[Decimal|base ten]], for efficiency reasons. The encoding scheme stores the sign, the [[exponent]] (in base two for Cray and [[VAX]], base two or ten for [[IEEE floating point]] formats, and base 16 for [[IBM hexadecimal floating-point|IBM Floating Point Architecture]]) and the [[significand]] (number after the [[radix point]]). While several similar formats are in use, the most common is [[IEEE 754-1985|ANSI/IEEE Std. 754-1985]]. This standard defines the format for 32-bit numbers called ''single precision'', as well as 64-bit numbers called ''double precision'' and longer numbers called ''extended precision'' (used for intermediate results). Floating-point representations can support a much wider range of values than fixed-point, with the ability to represent very small numbers and very large numbers.<ref>[http://www.dspguide.com/ch4/3.htm Floating Point] Retrieved on December 25, 2009.</ref>
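
The sign, exponent and significand fields described above can be illustrated with a minimal Python sketch for the IEEE 754 single-precision format. The function names <code>decompose_float32</code> and <code>rebuild_normal</code> are hypothetical and chosen only for this illustration; the sketch assumes the usual binary layout of 1 sign bit, 8 exponent bits stored with a bias of 127, and 23 fraction bits, and it handles only normal numbers when rebuilding a value.

```python
import struct


def decompose_float32(x):
    """Split x (rounded to IEEE 754 single precision) into its bit fields.

    Hypothetical helper for illustration only.
    """
    # Pack x as a big-endian 32-bit float, then reinterpret the same
    # four bytes as an unsigned 32-bit integer to get at the raw bits.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit: 0 = positive, 1 = negative
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF       # 23 bits after the radix point
    return sign, exponent, fraction


def rebuild_normal(sign, exponent, fraction):
    """Reconstruct a normal number from its fields.

    Assumes the exponent field is neither 0 (zero/subnormal)
    nor 255 (infinity/NaN).
    """
    # Normal numbers carry an implicit leading 1 before the radix point.
    significand = 1 + fraction / 2**23
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)


if __name__ == "__main__":
    fields = decompose_float32(-6.25)
    print(fields)
    print(rebuild_normal(*fields))
```

In this sketch −6.25 is stored as −1.5625 × 2<sup>2</sup>, so the exponent field holds 2 + 127 = 129 and the fraction field holds 0.5625 × 2<sup>23</sup>.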
 
===Dynamic range and precision===