ARM Cortex-A35 - A6.5 about Data Prefetching

A6.5 About data prefetching

This section describes the software and hardware data prefetching behavior for the processor.

Preload instructions

PLD instructions in AArch32, and PRFM instructions of type PLD in AArch64, look up in the cache and start a linefill if they miss and are to a cacheable address. These instructions retire as soon as their linefill has started; they do not wait for data to be returned. This enables other instructions to execute while the linefill continues in the background.

PLDW instructions in AArch32, and PRFM instructions of type PST in AArch64, are similar to PLD, except that if they miss, the linefill causes data to be invalidated in other cores and masters so that the line is ready for writing.

PRFM instructions can also target a prefetch at the L2 cache. In this case, a request is sent to the L2 memory system to start a linefill. The instruction then retires without any data being returned to the L1 memory system.
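
The three kinds of preload hint described above can be sketched in C with the GCC/Clang `__builtin_prefetch(addr, rw, locality)` builtin. This is a minimal sketch, not part of the processor documentation: the macro names, the `bump_counters` function, and the lookahead distance `AHEAD` are hypothetical, and the mapping from the builtin's arguments to specific PLD, PLDW, or PRFM encodings is an assumption about typical compiler behavior that should be checked in the generated assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrappers. On AArch64, compilers typically lower these to
 * PRFM PLDL1KEEP, PRFM PSTL1KEEP and PRFM PLDL2KEEP respectively
 * (assumption: verify in the disassembly for your toolchain). */
#define PREFETCH_READ_L1(p)   __builtin_prefetch((p), 0, 3)
#define PREFETCH_WRITE_L1(p)  __builtin_prefetch((p), 1, 3)
#define PREFETCH_READ_L2(p)   __builtin_prefetch((p), 0, 2)

/* Increment counters[idx[i]] for every i. The store addresses are
 * irregular, so a prefetch-for-write hint is issued a few iterations
 * ahead of the actual update. */
void bump_counters(unsigned *counters, const size_t *idx, size_t n)
{
    enum { AHEAD = 4 };  /* hypothetical lookahead distance, needs tuning */

    assert(counters != NULL && idx != NULL);
    for (size_t i = 0; i < n; i++) {
        if (i + AHEAD < n)
            PREFETCH_WRITE_L1(&counters[idx[i + AHEAD]]);
        counters[idx[i]]++;
    }
}
```

The hints never change program results; they only influence when linefills start. Replacing `PREFETCH_WRITE_L1` with `PREFETCH_READ_L2` in the loop would request the line in the L2 cache only.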

PLI instructions in AArch32, and PRFM instructions of type PLI in AArch64, are treated as NOPs.

Automatic data prefetching and monitoring

The L1 data-side memory system implements an automatic prefetcher that monitors cache misses in the core. When it detects a pattern, the prefetcher starts linefills in the background. The prefetcher recognizes a sequence of data cache misses with a fixed stride that lies within four cache lines, in either the positive or negative direction. Intervening loads or stores that hit in the data cache do not interfere with recognition of the cache miss pattern.

The CPUACTLR enables you to:

• Deactivate the prefetcher.

• Alter the sequence length required to trigger the prefetcher.

• Alter the number of outstanding requests that the prefetcher can make.

Use PLD or PRFM instructions for data prefetching where the access sequences are short or the access pattern is irregular.
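
An indexed gather is one example of an irregular pattern that the automatic prefetcher is unlikely to recognize. The following sketch issues a software read prefetch a fixed number of iterations ahead. The function name `gather_sum` and the distance `AHEAD` are hypothetical, and the `__builtin_prefetch` builtin (GCC/Clang) is assumed to compile to PLD or PRFM on Arm targets; a suitable distance depends on memory latency and on the work done per element.

```c
#include <assert.h>
#include <stddef.h>

/* Sum table[idx[i]] for i in [0, n). The table accesses follow the
 * index array, so their addresses have no fixed stride. */
long gather_sum(const int *table, const size_t *idx, size_t n)
{
    enum { AHEAD = 8 };  /* hypothetical prefetch distance, needs tuning */
    long total = 0;

    assert(table != NULL && idx != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Start the linefill for a future element; the hint retires
         * without waiting for the data. */
        if (i + AHEAD < n)
            __builtin_prefetch(&table[idx[i + AHEAD]], 0, 3);
        total += table[idx[i]];
    }
    return total;
}
```

The index array itself is read sequentially, so it is left to the automatic prefetcher.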

Non-temporal loads

Cache requests made by a non-temporal load instruction (LDNP) are allocated to the L2 cache only. The allocation policy makes it likely that the line is replaced sooner than other lines.

Data Cache Zero

The Data Cache Zero by Virtual Address (DC ZVA) instruction sets a 64-byte block of memory, aligned to a 64-byte boundary, to zero. If the DC ZVA instruction misses in the cache, it clears main memory without causing an L1 or L2 cache allocation.
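
A minimal sketch of zeroing memory with DC ZVA from C follows. It assumes a GCC or Clang toolchain with inline assembly, and the helper names `dc_zva_usable` and `zero_blocks` are hypothetical. The code reads the architectural DCZID_EL0 register to confirm that DC ZVA is permitted at the current Exception level and that the block size is 64 bytes, and falls back to `memset` otherwise (including on non-AArch64 builds). Whether a miss bypasses cache allocation is not visible to the software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ZVA_BLOCK = 64 };  /* block size stated for this processor */

/* Returns 1 if DC ZVA may be used here with 64-byte blocks. */
static int dc_zva_usable(void)
{
#if defined(__aarch64__)
    uint64_t dczid;
    __asm__ volatile("mrs %0, dczid_el0" : "=r"(dczid));
    /* DZP (bit 4) set means DC ZVA is prohibited.
     * BS (bits [3:0]) is log2 of the block size in 4-byte words. */
    return ((dczid >> 4) & 1) == 0 &&
           ((size_t)4 << (dczid & 0xF)) == (size_t)ZVA_BLOCK;
#else
    return 0;
#endif
}

/* Zero nblocks consecutive 64-byte blocks starting at buf.
 * buf must be 64-byte aligned so no bytes outside the range are cleared. */
void zero_blocks(void *buf, size_t nblocks)
{
    unsigned char *p = buf;

    assert(((uintptr_t)p % ZVA_BLOCK) == 0);
    if (!dc_zva_usable()) {
        memset(p, 0, nblocks * ZVA_BLOCK);  /* portable fallback */
        return;
    }
#if defined(__aarch64__)
    for (size_t i = 0; i < nblocks; i++)
        __asm__ volatile("dc zva, %0" : : "r"(p + i * ZVA_BLOCK) : "memory");
#endif
}
```

The alignment check matters because DC ZVA clears the whole aligned block that contains the given address, not just the bytes from that address onward.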

Related information

B2.36 CPU Auxiliary Control Register, EL1 on page B2-412
