Memory hierarchy

This is an old revision of this page, as edited by Intgr (talk | contribs) at 15:29, 8 September 2006.

The hierarchical arrangement of storage in current computer architectures is called the memory hierarchy. It is designed to take advantage of memory locality in computer programs. Each level of the hierarchy is faster, has lower latency, and is smaller than the levels below it.

Most modern CPUs are so fast that, for most program workloads, the practical limitation on processing speed is not raw computation but the locality of reference of memory accesses and the efficiency of caching and data transfer between levels of the hierarchy. As a result, the CPU spends much of its time idling, waiting for memory I/O to complete.

The memory hierarchy in most computers is as follows:

  • Processor registers – fastest possible access (usually 1 CPU cycle), only hundreds of bytes in size
  • Level 1 (L1) cache – often accessed in just a few cycles, usually tens of kilobytes
  • Level 2 (L2) cache – higher latency than L1 by 2× to 10×, often 512 KiB or more
  • Level 3 (L3) cache – (optional) higher latency than L2, often several MiB
  • Main memory (DRAM) – may take hundreds of cycles to access, but can be multiple gigabytes in size. Access times may not be uniform on NUMA machines.
  • Disk storage – hundreds of thousands of cycles latency, but very large
  • Tertiary storage – tape, optical disk (WORM)

Management

Modern programming languages mainly assume two levels of memory, main memory and disk storage, though directly accessing registers is possible in rare cases. Programmers are responsible for moving data between disk and memory through file I/O. Hardware is responsible for moving data between memory and caches. Compilers optimize the use of caches and registers.
