Multiply-accumulate unit (MAC) UM0404
86/564 DocID13284 Rev 2
the post-modification of IDXi. It is obtained by applying the reverse of the operation used to
calculate the new value of IDXi. The following table shows these rules.
The Parallel Data Move shifts a table of operands in parallel with a computation on those
operands. It is intended for signal processing algorithms such as filter computation. The
following figure gives an example of Parallel Data Move with the CoMACM instruction.
Figure 17. Example of parallel data move
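The effect of a parallel data move on an operand table can be sketched in C. This is only an illustrative model, not the hardware behaviour in full: the names mac_model_t and comacm_postinc are hypothetical, memory is modelled as an array of 16-bit words (so a byte offset of 2 becomes one array element), the write-back address of Table 11 is assumed to be relative to the value of IDXi before post-modification, and the accumulator is a plain 64-bit integer with no MP scaling, flags or saturation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one CoMACM [IDXi+], [Rm+] step. */
typedef struct {
    int16_t *x;        /* operand table addressed through IDXi       */
    const int16_t *h;  /* second operand table addressed through Rm  */
    size_t idx;        /* word index standing in for IDXi            */
    size_t r;          /* word index standing in for Rm              */
    int64_t acc;       /* simplified accumulator (assumption)        */
} mac_model_t;

static void comacm_postinc(mac_model_t *m)
{
    assert(m->idx >= 1);                /* write-back slot must exist   */
    int16_t a = m->x[m->idx];           /* operand read through IDXi    */
    int16_t b = m->h[m->r];             /* operand read through Rm      */
    m->acc += (int32_t)a * (int32_t)b;  /* multiply-accumulate          */
    m->x[m->idx - 1] = a;               /* write-back to <IDXi-2>       */
    m->idx += 1;                        /* post-increment IDXi (+2 bytes) */
    m->r += 1;                          /* post-increment Rm (+2 bytes)   */
}
```

Calling the function repeatedly walks through the table and moves every operand one word down, which is the delay-line update a filter needs between two output samples.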
4.2.4 16 x 16 signed/unsigned parallel multiplier
The multiplier executes 16 x 16-bit parallel signed/unsigned fractional and integer multiplies.
The multiplier has two 16-bit input ports, and a 32-bit product output port. The input ports
can accept data from the MA-bus and from the MB-bus. The output is sign-extended and
then feeds a scaler that shifts the multiplier output according to the shift mode bit MP
specified in the co-processor Control Word (MCW). If bit MP is set, the product is shifted
one bit left to compensate for the extra sign bit gained in multiplying two 16-bit signed
(2’s complement) fractional numbers.
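The role of the one-bit shift can be illustrated with a small C sketch. The function name mul16_scaled and its mp parameter are hypothetical, only the signed case is shown, and edge cases such as saturation are not modelled. The result is returned in a 64-bit integer to stand in for the sign-extended multiplier output.

```c
#include <stdint.h>

/* Signed 16 x 16 multiply followed by the MP-controlled scaler.
 * mp != 0 models bit MP set: the product is shifted one bit left. */
static int64_t mul16_scaled(int16_t a, int16_t b, int mp)
{
    int64_t product = (int64_t)a * (int64_t)b;  /* fits in 32 bits   */
    return mp ? product * 2 : product;          /* one-bit left shift */
}
```

For example, with both inputs equal to 4000h (0.5 in 1.15 fractional format), the unshifted product is 1000'0000h, which carries two sign bits. With the shift enabled it becomes 2000'0000h, that is 0.25 in 1.31 fractional format.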
4.2.5 40-bit signed arithmetic unit
The arithmetic unit is over 32 bits wide to allow intermediate overflow in a series of
multiply/accumulate operations. The extension flag E, contained in the most significant byte
of MSW, is set when the Accumulator has overflowed beyond the 32-bit boundary, that is,
when there are significant (non-sign) bits in the top eight (signed arithmetic) bits of the
Accumulator.
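The condition described for the E flag can be expressed as a range check. The following C sketch is an assumption-laden illustration: the function name ext_flag is hypothetical, and the 40-bit Accumulator is held in an int64_t whose value is assumed to stay within the signed 40-bit range.

```c
#include <stdint.h>

/* Returns 1 when the accumulator value no longer fits in a signed
 * 32-bit word, i.e. when the top eight bits are not a pure sign
 * extension of bit 31; returns 0 otherwise. */
static int ext_flag(int64_t acc40)
{
    return acc40 < INT32_MIN || acc40 > INT32_MAX;
}
```

Two accumulated products of 7FFFh x 7FFFh (3FFF'0001h each) still fit in 32 bits, so the check returns 0. A third one pushes the sum past 7FFF'FFFFh and the check returns 1, while the 40-bit value itself remains exact.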
The 40-bit arithmetic unit has two 40-bit input ports A and B. The A-input port accepts data
from four possible sources: 00’0000’0000h, 00’0000’8000h (round), the sign-extended
product, or the sign-extended data conveyed by the 32-bit bus resulting from the
Table 11. Parallel data move addressing
Instruction                  Writeback address
CoMACM [IDXi+],...           <IDXi-2>
CoMACM [IDXi-],...           <IDXi+2>
CoMACM [IDXi+QXj],...        <IDXi-QXj>
CoMACM [IDXi-QXj],...        <IDXi+QXj>
[Figure 17: memory before and after execution of CoMACM [IDX0+], [R2+], showing the 16-bit locations n+2, n, n-2 and n-4, the operand X, the pointer IDX0 and the parallel data move.]