Memory hierarchy: Difference between revisions

* [[Processor register]]s{{dash}}the fastest possible access (usually 1 CPU cycle). A few thousand bytes in size.
* [[CPU cache|Cache]]
** Level 0 (L0), [[micro-operation]]s cache{{dash}}6,144 bytes (6 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}})<ref>{{cite web|url=http://www.anandtech.com/show/6355/intels-haswell-architecture/6 |archive-url=https://web.archive.org/web/20121007163104/http://www.anandtech.com/show/6355/intels-haswell-architecture/6 |url-status=dead |archive-date=October 7, 2012 |title=Intel's Haswell Architecture Analyzed: Building a New PC and a New Intel |publisher=AnandTech |access-date=2014-07-31}}</ref> in size
** Level 1 (L1) [[Opcode|instruction]] cache{{dash}}128 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size
** Level 1 (L1) data cache{{dash}}128 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 700 [[Gigabyte|GB]]/s.<ref name=sisd_qa_f_mem_hsw>{{cite web|url=http://www.sisoftware.co.uk/?d=qa&f=mem_hsw |title=SiSoftware Zone |publisher=Sisoftware.co.uk |access-date=2014-07-31|archive-url=https://web.archive.org/web/20140913231938/http://www.sisoftware.co.uk/?d=qa&f=mem_hsw|archive-date=2014-09-13}}</ref>
** Level 2 (L2) instruction and data (shared){{dash}}1 [[MiB]]{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 200 GB/s.<ref name=sisd_qa_f_mem_hsw />
** Level 3 (L3) shared cache{{dash}}6 MiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 100 GB/s.<ref name=sisd_qa_f_mem_hsw />
** Level 4 (L4) shared cache{{dash}}128 MiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 40 GB/s.<ref name=sisd_qa_f_mem_hsw />
* [[Computer memory|Main memory]] ([[primary storage]]){{dash}}[[GiB]]{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 10 GB/s.<ref name=sisd_qa_f_mem_hsw /> In the case of a [[Non-Uniform Memory Access|NUMA]] machine, access times may not be uniform.
* [[Mass storage]] ([[secondary storage]]){{dash}}[[terabyte]]s in size. {{As of|2017}}, the best access speed from a consumer [[Solid-state drive|solid state drive]] is about 2000 MB/s.<ref>{{cite web|url=http://www.storagereview.com/samsung_960_pro_m2_nvme_ssd_review|title=Samsung 960 Pro M.2 NVMe SSD Review|date=20 October 2016 |publisher=storagereview.com|access-date=2017-04-13}}</ref>
* [[Nearline storage]] ([[tertiary storage]]){{dash}}up to [[exabytes]] in size. {{As of|2013}}, best access speed is about 160 MB/s.<ref>{{cite web |url=http://www.lto.org/technology/generations.html |title=Ultrium – LTO Technology – Ultrium GenerationsLTO |publisher=Lto.org |access-date=2014-07-31 |url-status=dead |archive-url=https://web.archive.org/web/20110727052050/http://www.lto.org/technology/generations.html |archive-date=2011-07-27 }}</ref>
* [[Offline storage]]
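
The bandwidth figures quoted in the list above can be compared with a short calculation. The following is only an illustrative sketch: the dictionary name and function name are hypothetical, the figures are the era-specific examples from the list rather than general constants, and the model ignores latency and the fact that the faster levels are far too small to hold a full gigabyte.

```python
# Approximate best-case bandwidths taken from the list above, in GB/s.
# These are example figures for one processor generation, not constants.
BANDWIDTH_GB_PER_S = {
    "L1 data cache": 700.0,
    "L2 cache": 200.0,
    "L3 cache": 100.0,
    "L4 cache": 40.0,
    "main memory": 10.0,
    "consumer SSD": 2.0,       # 2000 MB/s
    "tape (nearline)": 0.16,   # 160 MB/s
}


def stream_time_seconds(size_gb: float, level: str) -> float:
    """Idealized time to stream size_gb gigabytes at a level's peak bandwidth.

    Ignores access latency and capacity limits; only the ratio between
    levels is meaningful.
    """
    return size_gb / BANDWIDTH_GB_PER_S[level]


if __name__ == "__main__":
    for level in BANDWIDTH_GB_PER_S:
        ms = stream_time_seconds(1.0, level) * 1000
        print(f"{level:>16}: {ms:9.2f} ms per GB")
```

Under these assumptions, moving the same amount of data takes roughly 70 times longer from main memory than from the L1 data cache, and several thousand times longer from tape.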
 
The lower levels of the hierarchy{{dash}}from mass storage downwards{{dash}}are also known as [[tiered storage]]. The formal distinction between online, nearline, and offline storage is:<ref name="pearson2010">{{cite web|last=Pearson|first=Tony|year=2010|title=Correct use of the term Nearline.|url=https://www.ibm.com/developerworks/community/blogs/InsideSystemStorage/entry/the_correct_use_of_the_term_nearline2|url-status=dead|archive-url=https://web.archive.org/web/20181127020712/https://www.ibm.com/developerworks/community/blogs/InsideSystemStorage/entry/the_correct_use_of_the_term_nearline2?lang=en|archive-date=2018-11-27|access-date=2015-08-16|work=IBM Developerworks, Inside System Storage}}</ref>
* Online storage is immediately available for I/O.
* Nearline storage is not immediately available, but can be made online quickly without human intervention.
* Offline storage is not immediately available, and requires some human intervention to bring online.
 
For example, always-on spinning disks are online, while spinning disks that spin down, such as massive arrays of idle disks ([[Non-RAID drive architectures#MAID|MAID]]), are nearline. Removable media such as tape cartridges that can be automatically loaded, as in a [[tape library]], are nearline, while cartridges that must be manually loaded are offline.
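
The three-way distinction reduces to two yes/no questions, which can be sketched as a small function. The function and parameter names here are hypothetical and chosen only for illustration.

```python
def storage_tier(immediately_available: bool, needs_human: bool) -> str:
    """Classify storage as online, nearline, or offline.

    immediately_available: the medium can serve I/O right now.
    needs_human: a person must intervene to bring it online.
    """
    if immediately_available:
        return "online"
    return "offline" if needs_human else "nearline"


if __name__ == "__main__":
    print(storage_tier(True, False))    # always-on spinning disk
    print(storage_tier(False, False))   # MAID array, or tape in a tape library
    print(storage_tier(False, True))    # tape cartridge loaded by hand
```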
 
Most modern [[Central processing unit|CPUs]] are so fast that, for most program workloads, the [[wikt:bottleneck|bottleneck]] is the [[locality of reference]] of memory accesses and the efficiency of the [[CPU cache|caching]] and memory transfer between different levels of the hierarchy{{Citation needed|date=September 2009}}. As a result, the CPU spends much of its time idling, waiting for memory I/O to complete. This is sometimes called the ''space cost'', as a larger memory object is more likely to overflow a small and fast level and require use of a larger, slower level. The resulting load on memory use is known as ''pressure'' (respectively ''register pressure'', ''cache pressure'', and (main) ''memory pressure''). Terms for data being missing from a higher level and needing to be fetched from a lower level are, respectively: [[register spilling]] (due to [[register pressure]]: register to cache), [[cache miss]] (cache to main memory), and (hard) [[page fault]] (''real'' main memory to ''virtual'' memory, i.e. mass storage, commonly referred to as ''disk'' regardless of the actual mass storage technology used).
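
The effect of locality of reference on cache misses can be illustrated with a toy simulation. This is a minimal sketch, assuming a hypothetical direct-mapped cache of 64 lines of 64 bytes each and a square matrix of 8-byte elements stored in row-major order; real caches are set-associative and use hardware prefetching, so actual miss counts will differ.

```python
def count_misses(addresses, num_lines=64, line_bytes=64):
    """Count misses in a hypothetical direct-mapped cache.

    Each memory block maps to exactly one cache line (block % num_lines).
    An access misses whenever that line currently holds a different block.
    """
    resident = [None] * num_lines   # block number held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes
        index = block % num_lines
        if resident[index] != block:
            resident[index] = block
            misses += 1
    return misses


def row_major(n, elem_bytes=8):
    """Byte addresses visited when scanning an n x n matrix row by row."""
    return [(i * n + j) * elem_bytes for i in range(n) for j in range(n)]


def column_major(n, elem_bytes=8):
    """Byte addresses visited when scanning the same matrix column by column."""
    return [(i * n + j) * elem_bytes for j in range(n) for i in range(n)]


if __name__ == "__main__":
    n = 256
    print("row-major misses:   ", count_misses(row_major(n)))     # 8192
    print("column-major misses:", count_misses(column_major(n)))  # 65536
```

In this model the row-by-row scan touches eight consecutive elements per cache line and misses once per line, while the column-by-column scan jumps 2,048 bytes between accesses and misses on every access, although both visit exactly the same 65,536 elements.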
 
Modern [[programming language]]s mainly assume two levels of memory, main (''working'') memory and mass storage, though in [[assembly language]] and [[inline assembler]]s in languages such as [[C (programming language)|C]], registers can be directly accessed. Taking optimal advantage of the memory hierarchy requires the cooperation of programmers, hardware, and compilers (as well as underlying support from the operating system):
*''Programmers'' are responsible for moving data between disk and memory through file I/O.
*''Hardware'' is responsible for moving data between memory and caches.