In a banked cache, the cache is divided into a cache dedicated to [[machine code|instruction]] storage and a cache dedicated to data. In contrast, a unified cache contains both the instructions and data in the same cache.<ref>Yan Solihin, 2015. Fundamentals of Parallel Multicore Architecture. CRC Press. p. 150. {{ISBN|978-1-4822-1119-1}}.</ref> While a program executes, the processor accesses the L1 cache (the uppermost-level cache, closest to the processor) to fetch both instructions and data. Performing both accesses at the same time in a unified cache requires multiple ports, which increases access time. Multiple ports also require additional hardware and wiring, leading to significant extra structure between the caches and the processing units.<ref>Steve Heath, 2002. Embedded Systems Design. Elsevier. p. 106. {{ISBN|978-0-08-047756-5}}.</ref> To avoid this, the L1 cache is often organized as a banked cache, which results in fewer ports, less hardware, and generally lower access times.<ref name=":1" />
Modern processors have split L1 caches, and in systems with multilevel caches the higher-level caches (L2 and beyond) are commonly unified.
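On Linux, the organization of each cache level is exposed through the sysfs cache interface, so the split or unified nature of a processor's caches can be listed directly. The following sketch, which assumes a Linux system with <code>/sys/devices/system/cpu/cpu0/cache/</code> populated, prints the level, type (Data, Instruction or Unified) and size of every cache reported for CPU 0; a split L1 appears as separate Data and Instruction entries, while L2 and L3 usually report Unified.

<syntaxhighlight lang="c">
/* Minimal sketch: list the cache levels of CPU 0 via the Linux sysfs
 * interface (assumes /sys/devices/system/cpu/cpu0/cache/ is present).
 * A split L1 shows up as separate "Data" and "Instruction" entries,
 * while L2/L3 typically report "Unified". */
#include <stdio.h>
#include <string.h>

static void read_attr(const char *dir, const char *attr, char *buf, size_t len)
{
    char path[256];
    snprintf(path, sizeof path, "%s/%s", dir, attr);
    FILE *f = fopen(path, "r");
    if (!f) { buf[0] = '\0'; return; }
    if (!fgets(buf, (int)len, f))
        buf[0] = '\0';
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* strip trailing newline */
}

int main(void)
{
    for (int i = 0; ; i++) {
        char dir[128], level[16], type[16], size[16];
        snprintf(dir, sizeof dir,
                 "/sys/devices/system/cpu/cpu0/cache/index%d", i);
        read_attr(dir, "level", level, sizeof level);
        if (level[0] == '\0')
            break;                    /* no more cache indices */
        read_attr(dir, "type", type, sizeof type);
        read_attr(dir, "size", size, sizeof size);
        printf("L%s %-11s %s\n", level, type, size);
    }
    return 0;
}
</syntaxhighlight>

On a typical x86-64 desktop this prints entries such as <code>L1 Data 32K</code>, <code>L1 Instruction 32K</code>, <code>L2 Unified 256K</code> and <code>L3 Unified 8192K</code>; exact sizes vary by processor.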
=== Inclusion policies ===
=== Intel i5 Raptor Lake-HX (2024) ===
6-core (performance | efficiency):
* L1 cache – 128 {{abbr|KB|kilobytes}} per core
* L2 cache – 2 {{abbr|MB|megabytes}} per core | 4–8 {{abbr|MB|megabytes}} semi-shared
* L3 cache – 20–24 {{abbr|MB|megabytes}} shared
=== AMD EPYC 9684X (Zen 4, 2023) ===
96-core:
* L1 cache – 64 {{abbr|KB|kilobytes}} per core
* L3 cache – 96 {{abbr|MB|megabytes}} shared
=== AMD Ryzen 7000 series (Zen 4, 2022) ===
6- to 16-core:
* L1 cache – 64 {{abbr|KB|kilobytes}} per core
* L3 cache – 32 to 128 {{abbr|MB|megabytes}} shared
=== AMD Zen 2 (2019) ===
* L1 cache – 32 KB data & 32 KB instruction per core, 8-way
* L2 cache – 512 KB per core, 8-way inclusive
* L3 cache – 16 MB local per 4-core CCX, 2 CCXs per chiplet, 16-way non-inclusive. Up to 64 MB on desktop CPUs and 256 MB on server CPUs
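The chip-level maxima quoted above follow from multiplying the per-CCX figure by the CCX and chiplet counts; assuming two chiplets for the largest desktop parts and eight for the largest server parts (chiplet counts are not stated in the list above):

:<math>16\,\text{MB} \times 2\,\tfrac{\text{CCX}}{\text{chiplet}} \times 2\,\text{chiplets} = 64\,\text{MB}</math>
:<math>16\,\text{MB} \times 2\,\tfrac{\text{CCX}}{\text{chiplet}} \times 8\,\text{chiplets} = 256\,\text{MB}</math>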
=== AMD Zen (2017) ===
* L1 cache – 32 KB data & 64 KB instruction per core, 4-way
* L2 cache – 512 KB per core, 4-way inclusive
* L3 cache – 4 MB local & remote per 4-core CCX, 2 CCXs per chiplet, 16-way non-inclusive. Up to 16 MB on desktop CPUs and 64 MB on server CPUs
=== Intel Kaby Lake (2016) ===
* L1 cache (instruction and data) – 64 KB per core
* L2 cache – 256 KB per core
* L3 cache – 2 MB to 8 MB shared<ref name=":3">{{Cite web|url=https://ark.intel.com/|title=Intel Kaby Lake Microarchitecture}}</ref>
=== Intel Broadwell (2014) ===
* L1 cache (instruction and data) – 64 {{abbr|KB|kilobytes}} per core
* L2 cache – 256 {{abbr|KB|kilobytes}} per core
=== IBM POWER7 (2010) ===
* L1 cache (instruction and data) – 32 KB each per core, each 64-banked
* L2 cache – 256 KB, 8-way, 128B block, write back, inclusive of L1, 2 ns access latency
* L3 cache – 8 regions of 4 MB (total 32 MB), local region 6 ns, remote 30 ns, each region 8-way associative, DRAM data array, SRAM tag array<ref>{{Cite web|url=https://www-03.ibm.com/systems/power/hardware/795/specs.html|archive-url=https://web.archive.org/web/20100821102938/http://www-03.ibm.com/systems/power/hardware/795/specs.html|url-status=dead|archive-date=August 21, 2010|title=IBM Power7}}</ref>
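Access latencies such as those quoted for POWER7 can be observed on most machines with a pointer-chasing microbenchmark: a chain of dependent loads over a working set that is stepped past each cache capacity shows the average load latency rising at every level boundary. The sketch below is a generic illustration in C, not a POWER7-specific tool; the working-set sizes and iteration count are arbitrary illustrative choices.

<syntaxhighlight lang="c">
/* Minimal pointer-chasing sketch: average load-to-load latency for
 * increasing working-set sizes. As the working set outgrows L1, L2
 * and L3 in turn, the measured latency steps up toward DRAM latency.
 * Sizes and iteration counts are illustrative, not tuned. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ITERS 10000000UL

static double chase(size_t bytes)
{
    size_t n = bytes / sizeof(size_t);
    size_t *next  = malloc(n * sizeof(size_t));
    size_t *order = malloc(n * sizeof(size_t));

    /* Build a random cyclic permutation so the hardware prefetcher
     * cannot predict the next address. */
    for (size_t i = 0; i < n; i++) order[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t t = order[i]; order[i] = order[j]; order[j] = t;
    }
    for (size_t i = 0; i < n; i++)
        next[order[i]] = order[(i + 1) % n];

    struct timespec t0, t1;
    volatile size_t p = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned long i = 0; i < ITERS; i++)
        p = next[p];                  /* each load address depends on the previous load */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    free(next);
    free(order);
    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    return ns / ITERS;                /* average latency per load */
}

int main(void)
{
    /* Step the working set from well inside L1 to well past L3. */
    for (size_t kb = 16; kb <= 64 * 1024; kb *= 4)
        printf("%8zu KB : %.1f ns per load\n", kb, chase(kb * 1024));
    return 0;
}
</syntaxhighlight>

Randomizing the pointer chain defeats the hardware prefetcher, so once the working set exceeds a given cache level the measured loads genuinely miss that level rather than being hidden by prefetching.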
== See also ==
* CPU microarchitectures mentioned in this article:
** [[POWER7]]
** [[Broadwell (microarchitecture)|Intel Broadwell]]
** [[Zen (microarchitecture)|AMD Zen]]
** [[Apple silicon|Apple Silicon]]
* [[CPU cache]]
* [[Memory hierarchy]]
* [[CAS latency|CAS latency (RAM)]]
* [[Cache (computing)]]