Cycle Timings and Interlock Behavior
ARM DDI 0388I Copyright © 2008-2012 ARM. All rights reserved. B-5
ID073015 Non-Confidential
The Cortex-A9 processor can load or store two 32-bit registers in each cycle. However, to access
64 bits, the address must be 64-bit aligned.
The Address Generation Unit (AGU) performs this scheduling. The number of cycles that the AGU
requires to process a load multiple or store multiple operation depends on the length of the
register list and on whether the address is 64-bit aligned. The resulting latency is the latency of
the first loaded register. Table B-3 shows the cycle timings for load multiple operations.
Table B-3 Load multiple operations cycle timings

                                  AGU cycles to process the instruction    Resulting latency
                                  (address aligned on a 64-bit boundary)
Instruction                       Yes          No                          Fast forward case    Other cases
LDM ,{1 register}                 1            1                           2                    3
LDM ,{2 registers}, LDRD, RFE     1            2                           2                    3
LDM ,{3 registers}                2            2                           2                    3
LDM ,{4 registers}                2            3                           2                    3
LDM ,{5 registers}                3            3                           2                    3
LDM ,{6 registers}                3            4                           2                    3
LDM ,{7 registers}                4            4                           2                    3
LDM ,{8 registers}                4            5                           2                    3
LDM ,{9 registers}                5            5                           2                    3
LDM ,{10 registers}               5            6                           2                    3
LDM ,{11 registers}               6            6                           2                    3
LDM ,{12 registers}               6            7                           2                    3
LDM ,{13 registers}               7            7                           2                    3
LDM ,{14 registers}               7            8                           2                    3
LDM ,{15 registers}               8            8                           2                    3
LDM ,{16 registers}               8            9                           2                    3
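
The AGU cycle counts in Table B-3 follow a regular pattern, which the short C sketch below reproduces. It is an illustration only, not ARM-provided code: the function names is_aligned_64, ldm_agu_cycles, and ldm_result_latency are hypothetical, and the closed-form expressions are inferred from the table rows rather than stated in the text. The sketch also assumes that an address that is not 64-bit aligned is still word aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when addr lies on a 64-bit (8-byte) boundary. */
static bool is_aligned_64(uint32_t addr)
{
    return (addr & 0x7u) == 0;
}

/*
 * AGU cycles to process an LDM of n_regs registers (1 to 16),
 * matching the "Yes" and "No" columns of Table B-3.
 * Assumption: formulas are inferred from the table values.
 */
static unsigned ldm_agu_cycles(unsigned n_regs, uint32_t addr)
{
    assert(n_regs >= 1 && n_regs <= 16);
    if (is_aligned_64(addr)) {
        /* Two registers per cycle: ceil(n_regs / 2). */
        return (n_regs + 1) / 2;
    }
    /* First cycle moves one register, then two per cycle. */
    return n_regs / 2 + 1;
}

/*
 * Resulting latency of the first loaded register, per Table B-3:
 * 2 cycles in the fast forward case, 3 cycles otherwise.
 */
static unsigned ldm_result_latency(bool fast_forward)
{
    return fast_forward ? 2u : 3u;
}
```

For example, under these assumptions ldm_agu_cycles(4, 0x1000) gives 2 and ldm_agu_cycles(4, 0x1004) gives 3, in line with the LDM ,{4 registers} row.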