ARM ARM1176JZF-S User Manual

VFP Instruction Execution
ARM DDI 0301H Copyright © 2004-2009 ARM Limited. All rights reserved. 21-5
ID012310 Non-Confidential, Unrestricted Access
21.4 Forwarding
In general, any forwarding operation reduces the stall time of a dependent instruction by one
cycle. The VFP11 coprocessor forwards data from load instructions to CDP instructions and
from CDP instructions to CDP instructions.
The VFP11 coprocessor does not forward in the following cases:
• from an instruction that produces integer data
• to a store instruction, FST, FSTM, MRC, or MRRC
• to an instruction of different precision.
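
The forwarding rules above can be summarized as a small predicate. The following Python sketch is an illustrative model only, not a description of the hardware. The `Instr` record, its `kind` labels, and the `can_forward` name are hypothetical and are introduced here purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical, simplified description of a VFP instruction."""
    mnemonic: str
    kind: str                     # assumed labels: "load", "cdp", "store", "transfer"
    precision: str                # "single" or "double"
    integer_result: bool = False  # True if the instruction produces integer data


def can_forward(producer: Instr, consumer: Instr) -> bool:
    """Return True if the text above permits forwarding producer -> consumer."""
    # Only two paths are described: load -> CDP and CDP -> CDP.
    # This also excludes stores, MRC, and MRRC as consumers.
    if producer.kind not in ("load", "cdp") or consumer.kind != "cdp":
        return False
    # No forwarding from an instruction that produces integer data.
    if producer.integer_result:
        return False
    # No forwarding to an instruction of different precision.
    if producer.precision != consumer.precision:
        return False
    return True


fadds = Instr("FADDS", "cdp", "single")
fmuld = Instr("FMULD", "cdp", "double")
fstd = Instr("FSTD", "store", "double")

print(can_forward(fadds, fadds))  # same precision, CDP to CDP
print(can_forward(fmuld, fadds))  # precision differs
print(can_forward(fmuld, fstd))   # consumer is a store
```

The three calls correspond to a CDP-to-CDP dependency of matching precision, a mixed-precision dependency, and a dependency feeding a store.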
In the examples that follow, the stall counts given are based on two data transfer assumptions:
• accesses by load operations result in cache hits and are able to deliver one or two data
words per cycle
• store operations write directly to the write buffer or cache and can transfer one or two data
words per cycle.
When these assumptions are valid, the VFP11 coprocessor operates at its highest performance.
When these assumptions are not valid, load and store operations are affected by the delay
required to access data. Example 21-1, Example 21-2 and Example 21-3 illustrate the
capabilities of the VFP11 coprocessor in ideal conditions.
In Example 21-1, the second FADDS instruction depends on the result of the first FADDS
instruction. The result of the first FADDS instruction is forwarded, reducing the stall from eight
cycles to seven cycles.
Example 21-1 Data forwarded to dependent instruction
FADDS S1, S2, S3
FADDS S8, S9, S1
In Example 21-2, there is no data forwarding of the double-precision FMULD data in D2 to the
single-precision FADDS data in S5, even though S5 is the upper half of D2.
Example 21-2 Mixed-precision data not forwarded
FMULD D2, D0, D1
FADDS S12, S13, S5
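
The register overlap in Example 21-2 can be checked with a short calculation. The sketch below assumes the usual VFP register bank layout, in which double-precision register Dn aliases the single-precision pair S(2n) and S(2n+1). This section states only that S5 is the upper half of D2, so the general mapping and the helper name `single_halves` are assumptions made for illustration.

```python
from typing import Tuple


def single_halves(d_index: int) -> Tuple[str, str]:
    """Return the (lower, upper) S register names assumed to alias D<d_index>."""
    # Assumes a register bank of 16 double-precision registers, D0-D15.
    if not 0 <= d_index <= 15:
        raise ValueError("expected a register index in the range 0-15")
    return f"S{2 * d_index}", f"S{2 * d_index + 1}"


lower, upper = single_halves(2)
print(lower, upper)  # S4 S5
```

Under this mapping, the FADDS in Example 21-2 reads a register that the FMULD writes, so the dependency is real even though the data is not forwarded.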
In Example 21-3, the double-precision FSTD stalls for eight cycles until the result of the
FMULD is written to the register file. No forwarding is done from the FMULD to the store
instruction.
Example 21-3 Data not forwarded to store instruction
FMULD D1, D2, D3
FSTD D1, [Rx]
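
The stall figures quoted for Example 21-1 and Example 21-3 are consistent with the general rule that forwarding saves one cycle. The following sketch applies that rule to the eight-cycle figure given in this section. The function name `stall_cycles` is hypothetical, and the model ignores the cache and write buffer effects described earlier.

```python
def stall_cycles(unforwarded_stall: int, forwarded: bool) -> int:
    """Apply the general rule: forwarding reduces the stall by one cycle."""
    if forwarded:
        return max(unforwarded_stall - 1, 0)
    return unforwarded_stall


# Example 21-1: FADDS result forwarded to a dependent FADDS.
print(stall_cycles(8, forwarded=True))   # 7

# Example 21-3: FMULD result is not forwarded to the FSTD.
print(stall_cycles(8, forwarded=False))  # 8
```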
