{{short description|32-bit computer number format}}
{{Cleanup|reason=<br/>{{*}} This article doesn't provide a good structure to lead users from easy to deeper understanding<br/>{{*}} Some points are 'explained' by lengthy examples instead of concise description of the concept|date=January 2025}}
'''Single-precision floating-point format''' (sometimes called '''FP32''' or '''float32''') is a [[computer number format]], usually occupying [[32 bits]] in [[computer memory]]; it represents a wide [[dynamic range]] of numeric values by using a [[floating point|floating radix point]].
One of the first [[programming language]]s to provide single- and double-precision floating-point data types was [[Fortran]]. Before the widespread adoption of IEEE 754-1985, the representation and properties of floating-point data types depended on the [[computer manufacturer]] and computer model, and upon decisions made by programming-language designers. E.g., [[GW-BASIC]]'s single-precision data type was the [[32-bit MBF]] floating-point format.
Single precision is termed ''REAL(4)'' or ''REAL*4'' in [[Fortran]];<ref>{{cite web|url=http://scc.ustc.edu.cn/zlsc/sugon/intel/compiler_f/main_for/lref_for/source_files/rfreals.htm|title=REAL Statement|website=scc.ustc.edu.cn|access-date=2013-02-28|archive-date=2021-02-24|archive-url=https://web.archive.org/web/20210224045812/http://scc.ustc.edu.cn/zlsc/sugon/intel/compiler_f/main_for/lref_for/source_files/rfreals.htm|url-status=dead}}</ref> ''SINGLE-FLOAT'' in [[Common Lisp]];<ref>{{Cite web|url=https://www.lispworks.com/documentation/HyperSpec/Body/t_short_.htm|title=CLHS: Type SHORT-FLOAT, SINGLE-FLOAT, DOUBLE-FLOAT...|website=www.lispworks.com}}</ref> ''float binary(p)'' with p≤21, ''float decimal(p)'' with the maximum value of p depending on whether the DFP (IEEE 754 DFP) attribute applies, in PL/I; ''float'' in [[C (programming language)|C]] with IEEE 754 support, [[C++]] (if it is in C), [[C Sharp (programming language)|C#]] and [[Java (programming language)|Java]];<ref>{{cite web|url=https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html|title=Primitive Data Types|website=Java Documentation}}</ref> ''Float'' in [[Haskell (programming language)|Haskell]]<ref>{{cite web|url=https://www.haskell.org/onlinereport/haskell2010/haskellch6.html#x13-1350006.4|title=6 Predefined Types and Classes|date=20 July 2010|website=haskell.org}}</ref> and [[Swift (programming language)|Swift]];<ref>{{cite web|url=https://developer.apple.com/documentation/swift/float|title=Float|website=Apple Developer Documentation}}</ref> and ''Single'' in [[Object Pascal]] ([[Delphi (programming language)|Delphi]]), [[Visual Basic]], and [[MATLAB]]. However, ''float'' in [[Python (programming language)|Python]], [[Ruby (programming language)|Ruby]], [[PHP]], and [[OCaml]] and ''single'' in versions of [[GNU Octave|Octave]] before 3.2 refer to [[double-precision floating-point format|double-precision]] numbers. In most implementations of [[PostScript]], and some [[embedded systems]], the only supported precision is single.
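For illustration, the following C program is a minimal sketch of how such a mapping can be checked; it assumes an implementation where <code>float</code> is the IEEE 754 binary32 type, which the C standard does not itself require:

<syntaxhighlight lang="c">
#include <float.h>
#include <stdio.h>

int main(void) {
    /* IEEE 754 binary32 occupies 4 bytes and has a 24-bit significand
       (23 stored fraction bits plus the implicit leading bit). */
    printf("sizeof(float) = %zu bytes\n", sizeof(float));
    printf("FLT_MANT_DIG  = %d bits\n", FLT_MANT_DIG);    /* 24 for binary32 */
    printf("FLT_MAX       = %.10g\n", (double)FLT_MAX);   /* ~3.4028234664e38 */
    printf("FLT_MIN       = %.10g\n", (double)FLT_MIN);   /* ~1.1754943508e-38 */
    return 0;
}
</syntaxhighlight>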
{{Floating-point}}
The stored exponents 00<sub>H</sub> and FF<sub>H</sub> are interpreted specially.
{|class="wikitable" style="text-align: center;"
|-
! Exponent !! fraction = 0 !! fraction ≠ 0 !! Equation
|-
Line 229 ⟶ 227:
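The following C sketch shows how the stored exponent selects between these interpretations; it assumes <code>float</code> is IEEE 754 binary32 and <code>unsigned int</code> is 32 bits wide, and the helper <code>classify</code> is hypothetical, not part of any standard library:

<syntaxhighlight lang="c">
#include <stdio.h>
#include <string.h>

static void classify(float f) {
    unsigned int bits;
    memcpy(&bits, &f, sizeof bits);               /* reinterpret the 32 bits */
    unsigned int sign     = bits >> 31;
    unsigned int exponent = (bits >> 23) & 0xFF;  /* stored (biased) exponent */
    unsigned int fraction = bits & 0x7FFFFF;      /* 23 fraction bits */

    printf("sign=%u exp=0x%02X frac=0x%06X -> ", sign, exponent, fraction);
    if (exponent == 0x00)
        puts(fraction == 0 ? "zero" : "subnormal");  /* (-1)^sign * 2^-126 * 0.fraction */
    else if (exponent == 0xFF)
        puts(fraction == 0 ? "infinity" : "NaN");
    else
        puts("normal value");                        /* (-1)^sign * 2^(exp-127) * 1.fraction */
}

int main(void) {
    classify(0.0f);      /* zero */
    classify(1e-45f);    /* rounds to the smallest subnormal */
    classify(1.0f);      /* normal value */
    return 0;
}
</syntaxhighlight>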
These examples are given in bit ''representation'', in [[hexadecimal]] and [[Binary number|binary]], of the floating-point value. This includes the sign, (biased) exponent, and significand.
{| style="font-family: monospace, monospace;"
|-
0 00000000 00000000000000000000001<sub>2</sub> = 0000 0001<sub>16</sub> = 2<sup>−126</sup> × 2<sup>−23</sup> = 2<sup>−149</sup> ≈ 1.4012984643 × 10<sup>−45</sup>▼
|
(smallest positive subnormal number)▼
▲0 00000000 00000000000000000000001<sub>2</sub> = 0000 0001<sub>16</sub> = 2<sup>−126</sup> × 2<sup>−23</sup> = 2<sup>−149</sup> ≈ 1.4012984643 × 10<sup>−45</sup><br />
0 00000000 11111111111111111111111<sub>2</sub> = 007f ffff<sub>16</sub> = 2<sup>−126</sup> × (1 − 2<sup>−23</sup>) ≈ 1.1754942107 ×10<sup>−38</sup><br />▼
{{spaces|38}}(largest subnormal number)
▲0 00000000 11111111111111111111111<sub>2</sub> = 007f ffff<sub>16</sub> = 2<sup>−126</sup> × (1 − 2<sup>−23</sup>) ≈ 1.1754942107 ×10<sup>−38</sup>
0 00000001 00000000000000000000000<sub>2</sub> = 0080 0000<sub>16</sub> = 2<sup>−126</sup> ≈ 1.1754943508 × 10<sup>−38</sup><br />▼
{{spaces|38}}(smallest positive normal number)
▲0 00000001 00000000000000000000000<sub>2</sub> = 0080 0000<sub>16</sub> = 2<sup>−126</sup> ≈ 1.1754943508 × 10<sup>−38</sup>
0 11111110 11111111111111111111111<sub>2</sub> = 7f7f ffff<sub>16</sub> = 2<sup>127</sup> × (2 − 2<sup>−23</sup>) ≈ 3.4028234664 × 10<sup>38</sup><br />▼
{{spaces|38}}(largest normal number)
▲0 11111110 11111111111111111111111<sub>2</sub> = 7f7f ffff<sub>16</sub> = 2<sup>127</sup> × (2 − 2<sup>−23</sup>) ≈ 3.4028234664 × 10<sup>38</sup>
0 01111110 11111111111111111111111<sub>2</sub> = 3f7f ffff<sub>16</sub> = 1 − 2<sup>−24</sup> ≈ 0.999999940395355225<br />▼
{{spaces|38}}(largest number less than one)
▲0 01111110 11111111111111111111111<sub>2</sub> = 3f7f ffff<sub>16</sub> = 1 − 2<sup>−24</sup> ≈ 0.999999940395355225
0 01111111 00000000000000000000000<sub>2</sub> = 3f80 0000<sub>16</sub> = 1 (one)
0 01111111 00000000000000000000001<sub>2</sub> = 3f80 0001<sub>16</sub> = 1 + 2<sup>−23</sup> ≈ 1.00000011920928955<br />
▲0 00000000 00000000000000000000000<sub>2</sub> = 0000 0000<sub>16</sub> = 0
1 00000000 00000000000000000000000<sub>2</sub> = 8000 0000<sub>16</sub> = −0
0 11111111 00000000000000000000000<sub>2</sub> = 7f80 0000<sub>16</sub> = infinity<br />▼
▲0 11111111 00000000000000000000000<sub>2</sub> = 7f80 0000<sub>16</sub> = infinity
1 11111111 00000000000000000000000<sub>2</sub> = ff80 0000<sub>16</sub> = −infinity
0 10000000 10010010000111111011011<sub>2</sub> = 4049 0fdb<sub>16</sub> ≈ 3.14159274101257324 ≈ π (
▲0 10000000 10010010000111111011011<sub>2</sub> = 4049 0fdb<sub>16</sub> ≈ 3.14159274101257324 ≈ π ( pi )
0 01111101 01010101010101010101011<sub>2</sub> = 3eaa aaab<sub>16</sub> ≈ 0.333333343267440796 ≈ 1/3
x 11111111 10000000000000000000001<sub>2</sub> = ffc0 0001<sub>16</sub> = qNaN (on x86 and ARM processors)<br />▼
▲<pre>
▲x 11111111 10000000000000000000001<sub>2</sub> = ffc0 0001<sub>16</sub> = qNaN (on x86 and ARM processors)
x 11111111 00000000000000000000001<sub>2</sub> = ff80 0001<sub>16</sub> = sNaN (on x86 and ARM processors)
|}
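A few rows of the table can be reproduced by reinterpreting the hexadecimal bit pattern as a <code>float</code>. The following C sketch assumes an IEEE 754 binary32 <code>float</code> and a 32-bit <code>unsigned int</code>; <code>from_bits</code> is a hypothetical helper, not a standard function:

<syntaxhighlight lang="c">
#include <stdio.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without violating aliasing rules. */
static float from_bits(unsigned int bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

int main(void) {
    printf("%.17g\n", (double)from_bits(0x00000001u)); /* ~1.4012984643e-45, smallest subnormal */
    printf("%.17g\n", (double)from_bits(0x3f800000u)); /* 1 */
    printf("%.17g\n", (double)from_bits(0x3f800001u)); /* 1 + 2^-23 */
    printf("%.17g\n", (double)from_bits(0x40490fdbu)); /* ~3.14159274101257324 (pi, rounded) */
    printf("%.17g\n", (double)from_bits(0x7f800000u)); /* infinity */
    return 0;
}
</syntaxhighlight>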
By default, 1/3 rounds up instead of down as in [[Double-precision floating-point format|double precision]], because the significand has an even number of bits. The bits of 1/3 beyond the rounding point are <code>1010...</code>, which is more than 1/2 of a [[unit in the last place]].
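This difference in rounding direction can be observed directly; the following C sketch assumes IEEE 754 arithmetic with the default round-to-nearest-even mode:

<syntaxhighlight lang="c">
#include <stdio.h>

int main(void) {
    float  s = 1.0f / 3.0f;  /* single precision: rounds up, bit pattern 3eaa aaab */
    double d = 1.0  / 3.0;   /* double precision: rounds down */

    printf("single: %.17g\n", (double)s);  /* ~0.333333343267440796, slightly above 1/3 */
    printf("double: %.17g\n", d);          /* ~0.33333333333333331, slightly below 1/3 */
    printf("single > double: %s\n", (double)s > d ? "yes" : "no");  /* yes */
    return 0;
}
</syntaxhighlight>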