ST SPC560P34 - Microarchitecture Summary

RM0046 e200z0 and e200z0h Core

Doc ID 16912 Rev 5 271/936

● Testability

– Synthesizable, full MuxD scan design

– ABIST/MBIST for optional memory arrays

12.2.1 Microarchitecture summary

The e200z0 processor uses a four-stage pipeline for instruction execution. The Instruction Fetch (stage 1), Instruction Decode/Register File Read/Effective Address Calculation (stage 2), Execute/Memory Access (stage 3), and Register Writeback (stage 4) stages operate in an overlapped fashion, allowing single-clock instruction execution for most instructions.
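
As a rough illustration of what overlapped execution buys, the following C sketch computes an idealized clock count for a run of single-cycle instructions. The function name and the no-stall assumption are illustrative only and do not come from this manual; real code also pays for taken branches, multiplies, divides, and memory wait states.

```c
#include <assert.h>
#include <stdint.h>

/* Stages named above: fetch, decode, execute, writeback. */
enum { PIPELINE_STAGES = 4 };

/*
 * Idealized clock count for n back-to-back single-cycle instructions.
 * The first instruction needs PIPELINE_STAGES clocks to pass through the
 * pipeline; each later one completes one clock after its predecessor.
 * Assumption (hypothetical model): no stalls, no taken branches, no
 * multi-cycle instructions.
 */
static uint32_t ideal_pipeline_cycles(uint32_t n_instructions)
{
    if (n_instructions == 0) {
        return 0;
    }
    /* Guard against wrap-around in the addition below. */
    assert(n_instructions <= UINT32_MAX - PIPELINE_STAGES);
    return PIPELINE_STAGES + (n_instructions - 1);
}
```

Under this model a long instruction run approaches one clock per instruction, because the three-clock fill cost is paid only once.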

The integer execution unit consists of a 32-bit Arithmetic Unit (AU), a Logic Unit (LU), a 32-bit Barrel Shifter (Shifter), a Mask-Insertion Unit (MIU), a Condition Register manipulation Unit (CRU), a Count-Leading-Zeros unit (CLZ), an 8 × 32 hardware multiplier array, result feed-forward hardware, and a hardware divider.

Arithmetic and logical operations execute in a single cycle, with the exception of the divide and multiply instructions. The Count-Leading-Zeros unit operates in a single clock cycle.
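
For readers unfamiliar with the operation, the following C sketch is a software reference model of a 32-bit count-leading-zeros. It is not a description of the hardware, which produces the result in one clock; the function name is hypothetical, and returning 32 for a zero input is assumed here to follow the usual Power Architecture convention.

```c
#include <stdint.h>

/*
 * Software reference model of count-leading-zeros on a 32-bit word:
 * the number of consecutive zero bits starting at the most significant bit.
 * Assumption: a zero input yields 32.
 */
static uint32_t count_leading_zeros32(uint32_t x)
{
    uint32_t n = 0;

    if (x == 0) {
        return 32;
    }
    /* Shift left until the most significant bit is set. */
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}
```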

The Instruction Unit contains a PC incrementer and a dedicated Branch Address adder to minimize delays during change-of-flow operations. Sequential prefetching is performed to ensure a supply of instructions into the execution pipeline. Branch target prefetching from the BTB is performed to accelerate certain taken branches in the e200z0h. Prefetched instructions are placed into an instruction buffer with 4 entries in the e200z0h (2 entries in the e200z0), each capable of holding a single 32-bit instruction or a pair of 16-bit instructions.

Conditional branches that are not taken execute in a single clock. Branches with successful target prefetching have an effective execution time of one clock on e200z0h. All other taken branches have an execution time of two clocks.
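
The branch timing rules above can be summarized as a small lookup. The C sketch below is only a restatement of those three sentences for use in back-of-the-envelope cycle estimates; the type and function names are hypothetical and do not correspond to any vendor API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag for the two core variants discussed in this chapter. */
typedef enum { CORE_E200Z0, CORE_E200Z0H } core_variant_t;

/*
 * Effective branch execution time in clocks, per the rules stated above:
 *   - not-taken conditional branch: 1 clock
 *   - taken branch with successful target prefetch on e200z0h: 1 clock
 *   - any other taken branch: 2 clocks
 */
static uint32_t branch_clocks(core_variant_t core, bool taken,
                              bool target_prefetched)
{
    if (!taken) {
        return 1;
    }
    if (core == CORE_E200Z0H && target_prefetched) {
        return 1;
    }
    return 2;
}
```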

Memory load and store operations are provided for byte, halfword, and word (32-bit) data, with automatic zero or sign extension of byte and halfword load data as well as optional byte reversal of data. These instructions can be pipelined to allow effective single-cycle throughput. Load and store multiple word instructions allow low-overhead context save and restore operations. The load/store unit contains a dedicated effective address adder so that effective address generation can be optimized. In addition, a load-to-use dependency does not incur any pipeline bubbles in most cases.
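
To make the extension and byte-reversal behavior concrete, the following C sketch models what happens to halfword and word data as it is placed in a 32-bit register. It is a software model written for illustration; the function names are hypothetical and the hardware performs these steps as part of the load or store itself.

```c
#include <stdint.h>

/* Zero extension: upper 16 bits of the register are cleared. */
static uint32_t zero_extend_half(uint16_t h)
{
    return (uint32_t)h;
}

/* Sign extension: bit 15 of the halfword is copied into bits 16..31. */
static uint32_t sign_extend_half(uint16_t h)
{
    return (h & 0x8000u) ? (0xFFFF0000u | h) : (uint32_t)h;
}

/* Byte reversal of a halfword: swaps the two bytes. */
static uint16_t byte_reverse_half(uint16_t h)
{
    return (uint16_t)(((h & 0x00FFu) << 8) | ((h & 0xFF00u) >> 8));
}

/* Byte reversal of a word: byte 0 <-> byte 3, byte 1 <-> byte 2. */
static uint32_t byte_reverse_word(uint32_t w)
{
    return ((w & 0x000000FFu) << 24) |
           ((w & 0x0000FF00u) << 8)  |
           ((w & 0x00FF0000u) >> 8)  |
           ((w & 0xFF000000u) >> 24);
}
```

Byte reversal is what lets a big-endian core read or write little-endian data structures without a separate swap sequence in software.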

The Condition Register unit supports the condition register (CR) and the condition register operations defined by the Power Architecture. The condition register consists of eight 4-bit fields that reflect the results of certain operations, such as move, integer and floating-point compare, arithmetic, and logical instructions, and provide a mechanism for testing and branching.
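
The eight 4-bit fields can be picked out of a 32-bit CR image with a shift and a mask. The C sketch below assumes the standard Power Architecture layout, which is not restated in this section: field 0 (CR0) occupies the most significant four bits, and each field holds LT, GT, EQ, and SO from its most to least significant bit. The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit masks within one 4-bit CR field (assumed standard layout). */
enum {
    CR_LT = 0x8, /* less than / negative      */
    CR_GT = 0x4, /* greater than / positive   */
    CR_EQ = 0x2, /* equal / zero              */
    CR_SO = 0x1  /* summary overflow          */
};

/*
 * Extract CR field n (0..7) from a 32-bit condition register image.
 * Assumption: field 0 is the most significant nibble.
 */
static uint32_t cr_field(uint32_t cr, unsigned n)
{
    assert(n < 8);
    return (cr >> (28u - 4u * n)) & 0xFu;
}
```

For example, testing `cr_field(cr, 0) & CR_EQ` corresponds to the condition a branch-if-equal on CR0 would evaluate under the assumed layout.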

Vectored and autovectored interrupts are supported by the CPU. Vectored interrupt support is provided to allow multiple interrupt sources to have unique interrupt handlers invoked with no software overhead.
