Understanding Vybrid Architecture, Application Note, Rev. 0, 07/2014

Freescale Semiconductor, Inc.

Architectural key points

to one-way (L2C-310 Technical Reference Manual). The Vybrid L2 is configured as eight-way associative.

3.10 Cache miss and caches cooperation

Not all data can be stored in the cache. If the requested data are not in the cache, they have to be fetched from the source memory; this is called a cache miss.

For the Cortex-M4, an L1 cache miss costs one clock cycle of latency, after which the access has to go through the NIC to the given memory. This causes additional latency, described in the “NIC Transfer Latency” chapter of Cortex-A Series Programmer’s Guide [6]. When an L1 cache miss occurs, 32 B are loaded in burst mode, so the following opcodes are delayed less.
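The effect of the 32 B burst fill can be illustrated by counting how many cache lines a stretch of linear code covers. The following is a minimal sketch, assuming a cold cache and strictly sequential fetches; the helper names `line_base` and `lines_touched` are hypothetical and not part of any Vybrid or ARM API.

```c
#include <assert.h>
#include <stdint.h>

/* Line size taken from the text: 32 bytes are loaded per L1 miss. */
#define CACHE_LINE_BYTES 32u

/* Base address of the cache line that contains addr (hypothetical helper). */
static inline uint32_t line_base(uint32_t addr)
{
    return addr & ~(CACHE_LINE_BYTES - 1u);
}

/* Number of distinct cache lines covered by [start, start + len).
 * For linear code on a cold cache this equals the number of line fills. */
static inline uint32_t lines_touched(uint32_t start, uint32_t len)
{
    assert(len > 0u);
    assert(len - 1u <= UINT32_MAX - start); /* range must not wrap around */

    uint32_t first = line_base(start);
    uint32_t last  = line_base(start + (len - 1u));
    return (last - first) / CACHE_LINE_BYTES + 1u;
}
```

Under these assumptions, 64 bytes of line-aligned linear code cause two line fills rather than one miss per opcode, while a 4-byte opcode that straddles a line boundary touches two lines.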

For the Cortex-A5, an L1 cache miss costs one clock cycle of latency, after which the access proceeds to the L2 cache if it is enabled, or otherwise has to go through the NIC to the given memory. When the L2 cache is enabled, the cache access is done in two phases:

1. During the first phase of an L2 access, the tags are accessed in parallel and hit or miss is determined. The tag arrays are built into the L2 cache controller. No wait states are necessary.

2. During the second phase of an L2 access, a single data array structure is accessed:

a) L2 read cache hit: the L2 data array provides one line of data to the L2 cache controller at the same time. For Vybrid, it takes three clock cycles to read or write the L2 data array structure (upper GRAM on platform frequency). Together, this gives a latency of four Cortex-A5 clocks.

b) L1 and L2 miss: the L2 adds two clock cycles on the miss request and two cycles on the miss return. This four-cycle penalty is present even if the L2 is not enabled. Moreover, the data have to be fetched from the source memory. The total latency is 4 + 3 × (NIC latency + source memory) Cortex-A5 clocks.
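The two cases above can be written down as a small latency model. This is only a sketch: it reads the four-clock L2 hit figure as the one-cycle L1 miss plus the three-cycle data array access, which is an interpretation of the text, and it applies the factor 3 exactly as written without assuming the units of the bracketed term. The function names and the `nic_cycles` and `mem_cycles` parameters are hypothetical; the text gives no concrete values for them.

```c
#include <assert.h>
#include <stdint.h>

/* Cycle counts quoted in the text (Cortex-A5 clocks). */
#define L1_MISS_CYCLES        1u  /* latency taken on an L1 miss */
#define L2_DATA_ARRAY_CYCLES  3u  /* read or write of the L2 data array */
#define L2_MISS_PENALTY       4u  /* 2 on the miss request + 2 on the miss return */
#define EXTERNAL_SCALE        3u  /* the factor 3 in the text's formula */

/* Assumption: the "four clocks" for an L2 hit are 1 (L1 miss) + 3 (data array). */
static_assert(L1_MISS_CYCLES + L2_DATA_ARRAY_CYCLES == 4u,
              "text states four Cortex-A5 clocks for an L2 read hit");

/* L1 miss that hits in L2. */
static inline uint32_t l2_hit_latency(void)
{
    return L1_MISS_CYCLES + L2_DATA_ARRAY_CYCLES;
}

/* L1 and L2 miss: 4 + 3 x (NIC latency + source memory). */
static inline uint32_t l2_miss_latency(uint32_t nic_cycles, uint32_t mem_cycles)
{
    return L2_MISS_PENALTY + EXTERNAL_SCALE * (nic_cycles + mem_cycles);
}
```

With purely illustrative inputs of 10 cycles for the NIC and 20 cycles for the source memory, `l2_miss_latency(10, 20)` evaluates to 4 + 3 × 30 = 94 Cortex-A5 clocks, compared with 4 clocks for an L2 hit.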

One important element to consider when dealing with cache misses is the miss rate. The cache miss rate depends on the application, especially on factors like branch prediction, micro TLB hits and misses, instruction/data pipeline interaction, and the character of the code (repeatable or linear). Some raw data for estimation:

• Real data from different tests: ARM mode 4%, Thumb mode 2%

• Designer’s estimate: 20% for the worst case and linear code, normally 2%
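To see what these miss rates mean in cycles, they can be plugged into the standard textbook estimate of average access time (hit time plus miss rate times miss penalty). That formula is not taken from this application note, and the text does not say which cache level the quoted rates refer to, so the sketch below is only a rough estimation aid. The function name `avg_access_cycles` is hypothetical.

```c
#include <assert.h>

/* Miss rates quoted in the text, expressed as fractions. */
#define MISS_RATE_ARM_MODE    0.04  /* measured, ARM mode */
#define MISS_RATE_THUMB_MODE  0.02  /* measured, Thumb mode */
#define MISS_RATE_WORST_CASE  0.20  /* designer's estimate, worst case and linear code */
#define MISS_RATE_TYPICAL     0.02  /* designer's estimate, normal case */

/* Textbook estimate: hit_cycles + miss_rate * miss_penalty_cycles.
 * hit_cycles and miss_penalty_cycles are inputs the caller must supply. */
static inline double avg_access_cycles(double hit_cycles,
                                       double miss_rate,
                                       double miss_penalty_cycles)
{
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    return hit_cycles + miss_rate * miss_penalty_cycles;
}
```

For example, with a hypothetical one-cycle hit and a hypothetical 50-cycle miss penalty, the typical 2% miss rate averages about 2 cycles per access, while the 20% worst-case estimate averages about 11.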

For more information about how caches work, see the “Caches” chapter of the Cortex-A Series Programmer’s Guide [6].

3.11 Selecting the right memory

Vybrid integrates several memories that can be used by the Cortex-A5 or the Cortex-M4:

• SRAM 2 × 256 kB (Cortex-A5, Cortex-M4)

• GRAM 1 MB (Cortex-A5, Cortex-M4)

  - L2 cache shares 512 kB from GRAM (Cortex-A5)

• TCM (TCL 32 kB code bus, TCU 32 kB system bus) (Cortex-A5 by back door, Cortex-M4 directly)
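The sizes in this list can be captured as constants for quick budgeting. The sketch below assumes that the 512 kB used by the L2 cache is unavailable as general GRAM only while the L2 cache is enabled; the text does not state this explicitly. The macro names and `gram_available_kb` are hypothetical, not part of any Vybrid header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* On-chip memory sizes as listed in the text, in kB. */
#define SRAM_KB      (2u * 256u)  /* two SRAM blocks */
#define GRAM_KB      1024u        /* 1 MB */
#define L2_CACHE_KB  512u         /* part of GRAM shared with the L2 cache */
#define TCM_KB       (32u + 32u)  /* TCL (code bus) + TCU (system bus) */

static_assert(L2_CACHE_KB <= GRAM_KB, "L2 cache share cannot exceed GRAM");

/* GRAM left for general use.
 * Assumption: the L2 share is only taken while the L2 cache is enabled. */
static inline uint32_t gram_available_kb(bool l2_enabled)
{
    return l2_enabled ? (GRAM_KB - L2_CACHE_KB) : GRAM_KB;
}
```

Under that assumption, enabling the L2 cache leaves 512 kB of GRAM for data and code.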
