Revision as of 11:32, 21 February 2018 edit Robin48gx (talk \| contribs) 134 edits →Overview ← Previous edit		Revision as of 11:32, 21 February 2018 edit undo Robin48gx (talk \| contribs) 134 edits →Overview Next edit →
Line 11: To give an example, a common way to use [[Arbitrary-precision arithmetic\|integer arithmetic]] to simulate floating point, using 32 bit numbers, is to multiply the coefficients by 65536. Using [[binary scientific notation]], this will place the binary point at B16. That is to say there are 16 binary integer bits and the remainder are fractional. This means, as a signed two's complement integer B16 number can hold a highest value of <math> \approx 32767.999 </math> and a lowest value of <math> -32768.0</math> . Put another way, the B number, is the number of integer bits used to represent the number which defines its value range. Remaining ~~bit~~low bits (i.e. the non-integer bits) are used to store fractional quantities and supply more accuracy. For instance, to represent 1.2 and 5.6 floating point real numbers as B16 one multiplies them by 2<sup>16</sup>, giving 78643 and 367001.

Binary scaling: Difference between revisions