Memory hierarchy: Difference between revisions

Revision as of 23:13, 8 March 2025 by Stephan Leeds (→Examples: missing {{As of}} × 2)
Latest revision as of 05:47, 6 August 2025 by GreenC bot (Rescued 1 archive link. Wayback Medic 2.5 per WP:URLREQ#anandtech.com)
(7 intermediate revisions by one other user not shown)
* [[Processor register]]s{{dash}}the fastest possible access (usually 1 CPU cycle). A few thousand bytes in size.
* [[CPU cache|Cache]]
** Level 0 (L0), [[micro-operation]]s cache{{dash}}6,144 bytes (6 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}})<ref>{{cite web|url=http://www.anandtech.com/show/6355/intels-haswell-architecture/6 |archive-url=https://web.archive.org/web/20121007163104/http://www.anandtech.com/show/6355/intels-haswell-architecture/6 |url-status=dead |archive-date=October 7, 2012 |title=Intel's Haswell Architecture Analyzed: Building a New PC and a New Intel |publisher=AnandTech |access-date=2014-07-31}}</ref> in size.
** Level 1 (L1) [[Opcode|instruction]] cache{{dash}}128 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size.
** Level 1 (L1) data cache{{dash}}128 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 700 [[Gigabyte|GB]]/s.<ref name=sisd_qa_f_mem_hsw>{{cite web|url=http://www.sisoftware.co.uk/?d=qa&f=mem_hsw |title=SiSoftware Zone |publisher=Sisoftware.co.uk |access-date=2014-07-31|archive-url=https://web.archive.org/web/20140913231938/http://www.sisoftware.co.uk/?d=qa&f=mem_hsw|archive-date=2014-09-13}}</ref>
** Level 2 (L2) instruction and data (shared){{dash}}1 [[MiB]]{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 200 GB/s.<ref name=sisd_qa_f_mem_hsw />
** Level 3 (L3) shared cache{{dash}}6 MiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 100 GB/s.<ref name=sisd_qa_f_mem_hsw />
** Level 4 (L4) shared cache{{dash}}128 MiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 40 GB/s.<ref name=sisd_qa_f_mem_hsw />
* [[Computer memory|Main memory]] ([[primary storage]]){{dash}}[[GiB]]{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 10 GB/s.<ref name=sisd_qa_f_mem_hsw /> In the case of a [[Non-Uniform Memory Access|NUMA]] machine, access times may not be uniform.
* [[Mass storage]] ([[secondary storage]]){{dash}}[[terabyte]]s in size. {{As of|2017}}, the best access speed from a consumer [[Solid-state drive|solid-state drive]] is about 2000 MB/s.<ref>{{cite web|url=http://www.storagereview.com/samsung_960_pro_m2_nvme_ssd_review|title=Samsung 960 Pro M.2 NVMe SSD Review|date=20 October 2016 |publisher=storagereview.com|access-date=2017-04-13}}</ref>
* [[Nearline storage]] ([[tertiary storage]]){{dash}}up to [[exabytes]] in size. {{As of|2013}}, best access speed is about 160 MB/s.<ref>{{cite web |url=http://www.lto.org/technology/generations.html |title=Ultrium – LTO Technology – Ultrium GenerationsLTO |publisher=Lto.org |access-date=2014-07-31 |url-status=dead |archive-url=https://web.archive.org/web/20110727052050/http://www.lto.org/technology/generations.html |archive-date=2011-07-27 }}</ref>
* [[Offline storage]]

The lower levels of the hierarchy{{dash}}from mass storage downwards{{dash}}are also known as [[tiered storage]]. The formal distinction between online, nearline, and offline storage is:<ref name="pearson2010">{{cite web|last=Pearson|first=Tony|year=2010|title=Correct use of the term Nearline.|url=https://www.ibm.com/developerworks/community/blogs/InsideSystemStorage/entry/the_correct_use_of_the_term_nearline2|url-status=dead|archive-url=https://web.archive.org/web/20181127020712/https://www.ibm.com/developerworks/community/blogs/InsideSystemStorage/entry/the_correct_use_of_the_term_nearline2?lang=en|archive-date=2018-11-27|access-date=2015-08-16|work=IBM Developerworks, Inside System Storage}}</ref>
* Online storage is immediately available for I/O.
* Nearline storage is not immediately available, but can be made online quickly without human intervention.
* Offline storage is not immediately available, and requires some human intervention to bring online.

For example, always-on spinning disks are online, while spinning disks that spin down, such as massive arrays of idle disks ([[Non-RAID drive architectures#MAID|MAID]]), are nearline. Removable media such as tape cartridges that can be automatically loaded, as in a [[tape library]], are nearline, while cartridges that must be manually loaded are offline.

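
The distinction between online, nearline, and offline storage described above can be read as a two-question decision rule. The following is a minimal sketch of that rule in Python, not an established API: the <code>StorageDevice</code> descriptor, its two boolean fields, and the <code>classify</code> function are hypothetical names introduced purely for illustration, and the four sample devices are the examples given in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ONLINE = "online"
    NEARLINE = "nearline"
    OFFLINE = "offline"


@dataclass(frozen=True)
class StorageDevice:
    # Hypothetical descriptor for illustration; not part of any real storage API.
    name: str
    immediately_available: bool     # ready for I/O right now
    needs_human_intervention: bool  # a person must act to bring it online


def classify(device: StorageDevice) -> Tier:
    """Apply the online / nearline / offline distinction from the text."""
    if device.immediately_available:
        return Tier.ONLINE
    if device.needs_human_intervention:
        return Tier.OFFLINE
    # Not immediately available, but can come online without human help.
    return Tier.NEARLINE


# The four examples mentioned in the text.
EXAMPLES = [
    StorageDevice("always-on spinning disk", True, False),
    StorageDevice("spun-down MAID disk", False, False),
    StorageDevice("tape cartridge in a tape library", False, False),
    StorageDevice("manually loaded tape cartridge", False, True),
]

if __name__ == "__main__":
    for device in EXAMPLES:
        print(f"{device.name}: {classify(device).value}")
```

The sketch checks availability first, so a device that is already online is classified as online regardless of how it was brought up. The definition also requires that nearline storage can be made online ''quickly'', which the two boolean fields do not model.
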
Most modern [[Central processing unit|CPUs]] are so fast that, for most program workloads, the [[wikt:bottleneck|bottleneck]] is the [[locality of reference]] of memory accesses and the efficiency of the [[CPU cache|caching]] and memory transfer between different levels of the hierarchy.{{Citation needed|date=September 2009}} As a result, the CPU spends much of its time idling, waiting for memory I/O to complete. This is sometimes called the ''space cost'', as a larger memory object is more likely to overflow a small and fast level and require use of a larger, slower level. The resulting load on memory use is known as ''pressure'' (respectively ''register pressure'', ''cache pressure'', and (main) ''memory pressure''). 

Terms for data being missing from a higher level and needing to be fetched from a lower level are, respectively: [[register spilling]] (due to [[register pressure]]: register to cache), [[cache miss]] (cache to main memory), and (hard) [[page fault]] (''real'' main memory to ''virtual'' memory, i.e. mass storage, commonly referred to as ''disk'' regardless of the actual mass storage technology used).

Modern [[programming language]]s mainly assume two levels of memory, main (''working'') memory and mass storage, though in [[assembly language]] and [[inline assembler]]s in languages such as [[C (programming language)|C]], registers can be directly accessed. Taking optimal advantage of the memory hierarchy requires the cooperation of programmers, hardware, and compilers (as well as underlying support from the operating system):
* ''Programmers'' are responsible for moving data between disk and memory through file I/O.
* ''Hardware'' is responsible for moving data between memory and caches.