Revision as of 11:41, 6 December 2024, by Pausch (→Floating-point arithmetic)
Revision as of 14:13, 6 December 2024, by Zac67 (ce)
Line 55:
\|-
\|}
[[Floating-point arithmetic]] is needed for very large or very small [[real number]]s, or computations that require a large dynamic range. Floating-point representation is similar to scientific notation, except ~~everything usually is carried out in base two~~ computers use [[Binary number\|base two]] (with rare exceptions), rather than [[Decimal\|base ten]]~~, for efficiency reasons~~. The encoding scheme stores the sign, the [[exponent]] (in base two for Cray and [[VAX]], base two or ten for [[IEEE floating point]] formats, and base 16 for [[IBM hexadecimal floating-point\|IBM Floating Point Architecture]]) and the [[significand]] (number after the [[radix point]]). While several similar formats are in use, the most common is [[IEEE 754-1985\|ANSI/IEEE Std. 754-1985]]. This standard defines the format for 32-bit numbers called ''single precision'', as well as 64-bit numbers called ''double precision'' and longer numbers called ''extended precision'' (used for intermediate results). Floating-point representations can support a much wider range of values than fixed-point, with the ability to represent very small numbers and very large numbers.<ref>[http://www.dspguide.com/ch4/3.htm Floating Point] Retrieved on December 25, 2009.</ref>
===Dynamic range and precision===
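As an illustration of the sign, exponent and significand fields described above, the following minimal sketch splits a 32-bit single-precision value into its three fields and rebuilds it. The function names <code>decompose_single</code> and <code>recompose_single</code> are hypothetical and chosen only for this example. The sketch assumes the IEEE 754 binary32 layout (1 sign bit, 8 exponent bits with a bias of 127, 23 stored significand bits) and handles only normal numbers, not zeros, subnormals, infinities or NaNs.

```python
import struct


def decompose_single(x):
    """Split x, rounded to IEEE 754 single precision, into its bit fields.

    Returns (sign, biased_exponent, fraction). The names are illustrative,
    not part of any standard API.
    """
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF       # 23 stored bits after the radix point
    return sign, exponent, fraction


def recompose_single(sign, exponent, fraction):
    """Rebuild the value of a normal number (assumes 0 < exponent < 255)."""
    # Normal numbers carry an implicit leading 1 before the radix point.
    significand = 1 + fraction / 2**23
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)


sign, exponent, fraction = decompose_single(-6.5)
print(sign, exponent, fraction)
print(recompose_single(sign, exponent, fraction))
```

Here −6.5 equals −1.625 × 2<sup>2</sup>, so the stored exponent is 2 + 127 = 129 and the stored fraction encodes 0.625. Double precision follows the same scheme with 11 exponent bits and 52 stored significand bits, which is where its wider range and finer precision come from.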

Floating point operations per second: Difference between revisions