==Fixed Point Operations, Precision Loss, and Overflow==
Fixed point operations assume that the binary point in a number always lies between two specific bits in the field. For instance, the Q15 format assumes that there are 15 bits to the right of the binary point. This assumption matters for operations that can produce values with more bits than either operand. Multiplication is the usual case: the product can have as many bits as the two operands combined. The result is then typically [[rounded]] or [[truncated]] to fit into the same number of bits as the operands, so the choice of which bits to keep is very important.

When multiplying two fixed point numbers of the same format, with I integer bits and Q fractional bits, the product can have up to 2*I integer bits and 2*Q fractional bits. Most fixed-point processors keep the middle bits: the I least significant integer bits and the Q most significant fractional bits. The fractional bits discarded below this range represent a precision loss, which is common in fractional multiplication. If non-zero integer bits are discarded, however, the value will likely be radically inaccurate. This condition is an [[overflow]], and it usually causes an [[overflow flag]] to be set somewhere in the processor to indicate that the value is incorrect.
==Examples still in common use==