IQmath
The traditional approach to performing math operations, using fixed-point numerical techniques
can be demonstrated using a simple linear equation example. The floating-point code for a linear
equation would be:
float Y, M, X, B;
Y = M * X + B;
For the fixed-point implementation, assume all data is 32-bits, and that the "Q" value, or location
of the binary point, is set to 24 fractional bits (Q24). The numerical range and resolution for a
32-bit Q24 number is as follows:
Q value Min Value Max Value Resolution
Q24 -2
(32-24)
= -128.000 000 00 2
(32-24)
– (½)
24
= 127.999 999 94 (½)
24
= 0.000 000 06
The C code implementation of the linear equation is:
int32 Y, M, X, B; // numbers are all Q24
Y = ((int64) M * (int64) X + (int64) B << 24) >> 24;
Compared to the floating-point representation, it looks quite cumbersome and has little resem-
blance to the floating-point equation. It is obvious why programmers prefer using floating-point
math.
The slide shows the implementation of the equation on a processor containing hardware that can
perform a 32x32 bit multiplication, 64-bit addition and 64-bit shifts (logical and arithmetic) effi-
ciently.
The basic approach in traditional fixed-point "Q" math is to align the binary point of the operands
that get added to or subtracted from the multiplication result. As shown in the slide, the multipli-
cation of M and X (two Q24 numbers) results in a Q48 value that is stored in a 64-bit register.
The value B (Q24) needs to be scaled to a Q48 number before addition to the M*X value (low
order bits zero filled, high order bits sign extended). The final result is then scaled back to a Q24
number (arithmetic shift right) before storing into Y (Q24). Many programmers may be familiar
with 16-bit fixed-point "Q" math that is in common use. The same example using 16-bit numbers
with 15 fractional bits (Q15) would be coded as follows:
int16 Y, M, X, B; // numbers are all Q15
Y = ((int32) M * (int32) X + (int32) B << 15) >> 15;
In both cases, the principal methodology is the same. The binary point of the operands that get
added to or subtracted from the multiplication result must be aligned.
8 - 18 C28x - Numerical Concepts & IQmath