Instruction Cycle Timings
ARM DDI 0210C Copyright © 2001, 2004 ARM Limited. All rights reserved. 6-29
6.20 Instruction speed summary
Because of the pipelined architecture of the CPU, instructions overlap considerably. In
a typical cycle, one instruction can be using the data path while the next is being
decoded and the one after that is being fetched. For this reason Table 6-23 presents the
incremental number of cycles required by an instruction, rather than the total number of
cycles for which the instruction uses part of the processor. The elapsed time, in cycles,
for a routine can be calculated by summing the figures listed in Table 6-23. These figures
assume that the instruction is actually executed. An instruction whose condition is not
met is not executed and takes one S-cycle, whatever its type. The cycle types N, S,
I, and C are described in Bus cycle types on page 3-4.
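As a rough illustration of summing these figures, the C sketch below tallies S, N, and I
cycles for a few rows of Table 6-23. The struct and function names are hypothetical, and
total_clocks() assumes that every cycle type takes exactly one clock (no wait states),
which is an assumption of this sketch rather than something this section states.

```c
#include <stdbool.h>

/* Counts of each bus cycle type for one executed instruction. */
struct cycles { unsigned s, n, i; };

/* Data processing: S, +I for a register-specified shift,
 * +S+N if R15 is written. */
static struct cycles data_proc_cycles(bool shift_by_reg, bool writes_r15)
{
    struct cycles c = { 1, 0, 0 };
    if (shift_by_reg) c.i += 1;
    if (writes_r15) { c.s += 1; c.n += 1; }
    return c;
}

/* LDM: nS+N+I, +S+N if R15 is loaded. n_words is n in Table 6-23. */
static struct cycles ldm_cycles(unsigned n_words, bool loads_r15)
{
    struct cycles c = { n_words, 1, 1 };
    if (loads_r15) { c.s += 1; c.n += 1; }
    return c;
}

/* STM: (n-1)S+2N. n_words must be at least 1. */
static struct cycles stm_cycles(unsigned n_words)
{
    struct cycles c = { n_words - 1, 2, 0 };
    return c;
}

/* MUL: S+mI, where m (1 to 4) depends on the multiplier operand. */
static struct cycles mul_cycles(unsigned m)
{
    struct cycles c = { 1, 0, m };
    return c;
}

/* Sum the per-type counts of two instructions. */
static struct cycles add_cycles(struct cycles a, struct cycles b)
{
    struct cycles c = { a.s + b.s, a.n + b.n, a.i + b.i };
    return c;
}

/* ASSUMPTION: S, N and I cycles each take one clock (no wait states). */
static unsigned total_clocks(struct cycles c)
{
    return c.s + c.n + c.i;
}
```

For example, an STM of four registers, a MUL with m = 2, and an LDM of four registers
that includes R15 sum to (3S+2N) + (S+2I) + (5S+2N+I) = 9S+4N+3I, or 16 clocks under the
one-clock assumption. On a system with wait states, the three counts would need to be
weighted separately.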
In Table 6-23:
• b is the number of cycles spent in the coprocessor busy-wait loop
• m is:
  - 1 if bits [31:8] of the multiplier operand are all zero or all one, else
  - 2 if bits [31:16] of the multiplier operand are all zero or all one, else
  - 3 if bits [31:24] of the multiplier operand are all zero or all one, else
  - 4.
• n is the number of words transferred.
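The rule for m can be sketched in C as follows. The function name mul_m_cycles is
hypothetical and is used here only for illustration. The code simply tests the upper bits
of a 32-bit multiplier operand in the order given above.

```c
#include <stdint.h>

/* Hypothetical helper (not from the manual): returns m for a 32-bit
 * multiplier operand, checking the widest bit range first. */
static int mul_m_cycles(uint32_t rs)
{
    uint32_t top24 = rs >> 8;   /* bits [31:8]  */
    uint32_t top16 = rs >> 16;  /* bits [31:16] */
    uint32_t top8  = rs >> 24;  /* bits [31:24] */

    if (top24 == 0u || top24 == 0xFFFFFFu) return 1;
    if (top16 == 0u || top16 == 0xFFFFu)   return 2;
    if (top8  == 0u || top8  == 0xFFu)     return 3;
    return 4;
}
```

Small positive or negative multipliers therefore give m = 1, while an operand such as
0x12345678, whose top byte is neither all zero nor all one, gives m = 4.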
Table 6-23 ARM instruction speed summary
Instruction      Cycle count   Additional
Data Processing  S             +I for SHIFT(Rs)
                               +S+N if R15 written
MSR, MRS         S             -
LDR              S+N+I         +S+N if R15 loaded
STR              2N            -
LDM              nS+N+I        +S+N if R15 loaded
STM              (n-1)S+2N     -
SWP              S+2N+I        -
B, BL            2S+N          -
SWI, trap        2S+N          -
MUL              S+mI          -