Embedded Flash memory (FLASH) for category 3 devices (RM0440 Rev 4)
RCC_CFGR register and then (if needed) modify the CPU clock prescaler by writing the
HPRE bits in RCC_CFGR.
4. Check that the new CPU clock source and/or the new CPU clock prescaler value has been
taken into account by reading the clock source status (SWS bits) and/or the AHB
prescaler value (HPRE bits), respectively, in the RCC_CFGR register.
5. Program the new number of wait states to the LATENCY bits in the Flash access control
register (FLASH_ACR).
6. Check that the new number of wait states is used to access the Flash memory by
reading the FLASH_ACR register.
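Steps 4 to 6 can be sketched in C as follows. This is a minimal host-runnable sketch, not vendor code: the two registers are modeled as plain variables (rcc_cfgr, flash_acr), the function name flash_apply_lower_clock is hypothetical, and the field positions (SWS in bits 3:2, HPRE in bits 7:4, LATENCY in bits 3:0) are assumptions to be checked against the RCC_CFGR and FLASH_ACR register descriptions. On hardware, the variables would be replaced by the memory-mapped registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the memory-mapped registers so the sketch runs on a host. */
static volatile uint32_t rcc_cfgr;
static volatile uint32_t flash_acr;

/* Assumed field layout; verify against the register descriptions. */
#define CFGR_SWS_POS     2u
#define CFGR_SWS_MASK    (0x3u << CFGR_SWS_POS)
#define CFGR_HPRE_POS    4u
#define CFGR_HPRE_MASK   (0xFu << CFGR_HPRE_POS)
#define ACR_LATENCY_MASK 0xFu

/* Steps 4 to 6: confirm the clock change, then lower the wait states.
 * Returns false if the clock check fails within max_polls reads or if
 * the LATENCY field does not read back as written. */
static bool flash_apply_lower_clock(uint32_t sws, uint32_t hpre,
                                    uint32_t wait_states, unsigned max_polls)
{
    uint32_t expected = ((sws << CFGR_SWS_POS) & CFGR_SWS_MASK) |
                        ((hpre << CFGR_HPRE_POS) & CFGR_HPRE_MASK);
    bool ok = false;

    /* Step 4: poll until SWS and HPRE read back the requested values. */
    for (unsigned i = 0; i < max_polls && !ok; i++) {
        ok = (rcc_cfgr & (CFGR_SWS_MASK | CFGR_HPRE_MASK)) == expected;
    }
    if (!ok) {
        return false;
    }

    /* Step 5: program the new number of wait states. */
    flash_acr = (flash_acr & ~ACR_LATENCY_MASK) |
                (wait_states & ACR_LATENCY_MASK);

    /* Step 6: read FLASH_ACR back to confirm the new latency is in use. */
    return (flash_acr & ACR_LATENCY_MASK) == (wait_states & ACR_LATENCY_MASK);
}
```

The wait states are reduced only after the clock change has been confirmed, which matches the order of the steps above: the Flash latency stays at the higher, safe value until the CPU is known to be running at the lower frequency.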
3.3.4 Adaptive real-time memory accelerator (ART Accelerator)
The proprietary Adaptive real-time (ART) memory accelerator is optimized for STM32
industry-standard Arm® Cortex®-M4 with FPU processors. It balances the inherent
performance advantage of the Arm® Cortex®-M4 with FPU over Flash memory
technologies, which normally require the processor to wait for the Flash memory at higher
operating frequencies.
To release the processor's full performance, the accelerator implements an instruction
prefetch queue and a branch cache, which increase program execution speed from the
64-bit Flash memory. Based on the CoreMark benchmark, the performance achieved with the
ART accelerator is equivalent to 0 wait state program execution from Flash memory at a
CPU frequency of up to 170 MHz.
Instruction prefetch
The Cortex®-M4 fetches the instruction over the ICode bus and the literal pool
(constant/data) over the DCode bus. The prefetch block aims at increasing the efficiency of
ICode bus accesses.
In single bank mode (DBANK option bit is reset), each Flash memory read
operation provides 128 bits, holding either four 32-bit instructions or eight 16-bit
instructions depending on the launched program. This 128-bit current instruction line is
saved in a current buffer and, in case of sequential code, at least four CPU cycles are
needed to execute the previously read instruction line.
In dual bank mode (DBANK option bit is set), each Flash memory read operation
provides 64 bits, holding either two 32-bit instructions or four 16-bit instructions
depending on the launched program. This 64-bit current instruction line is saved in a
current buffer and, in case of sequential code, at least two CPU cycles are needed to
execute the previously read instruction line.
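The line widths and cycle counts in the two modes can be checked with a few lines of C. This is an illustrative sketch only: the function names are hypothetical, and the cycle bound assumes that the CPU consumes at most one 32-bit word of the line per cycle, which is consistent with the four-cycle and two-cycle figures quoted above but is not stated in this form in the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Width in bits of one Flash read, that is, of one instruction line. */
static uint32_t flash_line_bits(bool dbank_set)
{
    return dbank_set ? 64u : 128u;   /* dual bank : single bank */
}

/* Number of instructions of the given width (16 or 32 bits) in one line. */
static uint32_t instructions_per_line(bool dbank_set, uint32_t instr_bits)
{
    return flash_line_bits(dbank_set) / instr_bits;
}

/* Lower bound on the CPU cycles needed to execute one line of sequential
 * code, assuming at most one 32-bit word is consumed per cycle. */
static uint32_t min_cycles_per_line(bool dbank_set)
{
    return flash_line_bits(dbank_set) / 32u;
}
```

For example, instructions_per_line(false, 16u) gives the eight 16-bit instructions of a single bank line, and min_cycles_per_line(true) gives the two-cycle minimum of a dual bank line.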
Prefetch on the ICode bus can be used to read the next sequential instruction line from the
Flash memory while the current instruction line is being requested by the CPU.
Prefetch is enabled by setting the PRFTEN bit in the Flash access control register
(FLASH_ACR). This feature is useful if at least one wait state is needed to access the Flash
memory.
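Enabling prefetch only when it is useful can be sketched as below. The register is again modeled as a plain variable so that the example runs on a host; the helper name flash_enable_prefetch_if_useful is hypothetical, and the bit positions (LATENCY in bits 3:0, PRFTEN in bit 8) are assumptions to be checked against the FLASH_ACR register description.

```c
#include <stdint.h>

/* Stand-in for the memory-mapped FLASH_ACR register. */
static volatile uint32_t flash_acr;

#define ACR_LATENCY_MASK 0xFu        /* assumed: LATENCY in bits 3:0 */
#define ACR_PRFTEN       (1u << 8)   /* assumed: PRFTEN in bit 8 */

/* Set PRFTEN only when at least one wait state is configured.
 * Returns 1 if PRFTEN is set after the call, 0 otherwise. */
static int flash_enable_prefetch_if_useful(void)
{
    if ((flash_acr & ACR_LATENCY_MASK) >= 1u) {
        flash_acr |= ACR_PRFTEN;
    }
    return (flash_acr & ACR_PRFTEN) != 0u;
}
```

With zero wait states the helper leaves FLASH_ACR untouched, since the next line is already available without stalling and prefetch brings no benefit in that case.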
Figure 9 shows the execution of sequential 16-bit instructions with and without prefetch
when 3 WS are needed to access the Flash memory.