Virtual memory compression: Difference between revisions

Content deleted Content added
Origins: Added the compressed executable scheme implemented by RISC iX.
OAbot (talk | contribs)
m Open access bot: url-access=subscription updated in citation with #oabot.
 
(29 intermediate revisions by 13 users not shown)
Line 1:
{{Short description|Algorithms for compressing in-memory data}}
{{Use dmy dates|date=May 2019|cs1-dates=y}}
{{Use list-defined references|date=December 2021}}
Line 7 ⟶ 8:
Virtual memory compression is distinct from [[garbage collection (computer science)|garbage collection]] (GC) systems, which remove unused memory blocks and in some cases consolidate used memory regions, reducing fragmentation and improving efficiency. Virtual memory compression is also distinct from [[context switching]] systems, such as [[Connectix]]'s [[RAM Doubler]] (though it also did online compression) and Apple OS 7.1, in which inactive processes are suspended and then compressed as a whole.<ref name="CWORLD-RD2"/>
 
==BenefitsTypes==
{{More citations needed|section|date=May 2019}}
 
There are two general types of virtual memory compression : (1) sending compressed pages to a swap file in main memory, possibly with a backing store in auxiliary storage,<ref name ="CaseForCompressedCaching"/><ref name="zram_kernel_org">{{cite web |url=https://www.kernel.org/doc/html/next/admin-guide/blockdev/zram.html |title="zram: Compressed RAM-based block devices" |last="Gupta" |first="Nitin" |website=docs.kernel.org |publisher="The kernel development community" |access-date=2023-12-29 }}</ref><ref name="zswap_kernel_org">{{cite web |url=https://www.kernel.org/doc/html/v4.18/vm/zswap.html |title="zswap" |website=www.kernel.org |publisher="The kernel development community" |access-date=2023-12-29 }}</ref> and (2) storing compressed pages side-by-side with uncompressed pages.<ref name="CaseForCompressedCaching"/>
 
The first type (1) usually uses some sort of [[LZ77_and_LZ78|LZ]] class dictionary compression algorithm combined with [[entropy coding]], such as [[Lempel–Ziv–Oberhumer|LZO]] or [[LZ4_(compression_algorithm)|LZ4]],<ref name="zswap_kernel_org" /><ref name="zram_kernel_org" /> to compress the pages being swapped out. Once compressed, they are either stored in a swap file in main memory, or written to auxiliary storage, such as a hard disk.<ref name="zswap_kernel_org" /><ref name="zram_kernel_org" /> A two stage process can be used instead wherein there exists both a backing store in auxiliary storage and a swap file in main memory and pages that are evicted from the in-memory swap file are written to the backing store with a much increased write bandwidth (eg. pages/sec) so that writing to the backing store takes less time. This last scheme leverages the benefits of both previous methods : fast in-memory data access with a large increase in the total amount of data that can be swapped out and an increased bandwidth in writing pages (pages/sec) to auxiliary storage.<ref name="zswap_kernel_org" /><ref name="zram_kernel_org" /><ref name ="CaseForCompressedCaching"/>
 
One example of a class of algorithms for type (2) virtual memory compression is the WK (Wilson-Kaplan et. al) class of compression algorithms. These take advantage of in-memory data regularities present in pointers and integers.<ref name ="CaseForCompressedCaching"/><ref name="SIMPSON" /> Specifically, in (the data segment -- the WK algorithms are not suitable for instruction compression<ref name ="CaseForCompressedCaching"/>) target code generated by most high-level programming languages, both integers and pointers are often present in records whose elements are word-aligned. Furthermore, the values stored in integers are usually small. Also pointers close to each other in memory tend to point to locations that are themselves nearby in memory. Additionally, common data patterns such as a word of all zeroes can be encoded in the compressed output by a very small code (two bits in the case of WKdm). Using these data regularities, the WK class of algorithms use a very small dictionary ( 16 entries in the case of [[WKdm]] ) to achieve up to a 2:1 compression ratio while achieving much greater speeds and having less overhead than LZ class dictionary compression schemes.<ref name ="CaseForCompressedCaching"/><ref name="SIMPSON" />
 
==Benefits==
By reducing the I/O activity caused by paging requests, virtual memory compression can produce overall performance improvements. The degree of performance improvement depends on a variety of factors, including the availability of any compression co-processors, spare bandwidth on the CPU, speed of the I/O channel, speed of the physical memory, and the compressibility of the physical memory contents.
 
Line 15 ⟶ 22:
 
In some situations, such as in [[embedded device]]s, auxiliary storage is limited or non-existent. In these cases, virtual memory compression can allow a virtual memory system to operate, where otherwise virtual memory would have to be disabled. This allows the system to run certain software which would otherwise be unable to operate in an environment with no virtual memory.<ref name="Paul_1997_NWDOSTIP"/>
 
[[Flash memory]] has certain endurance limitations on the maximum number of erase cycles it can undergo, which can be as low as 100 erase cycles. In systems where Flash Memory is used as the only auxiliary storage system, implementing virtual memory compression can reduce the total quantity of data being written to auxiliary storage, improving system reliability.
 
==Shortcomings==
Line 34 ⟶ 39:
For example, in order to maximize the use of a compressed pages cache, [[Helix Software Company]]'s Hurricane&nbsp;2.0 provides a user-configurable compression rejection threshold. By compressing the first 256 to 512 bytes of a 4&nbsp;KiB page, this virtual memory compression system determines whether the configured compression level threshold can be achieved for a particular page; if achievable, the rest of the page would be compressed and retained in a compressed cache, and otherwise the page would be sent to auxiliary storage through the normal paging system. The default setting for this threshold is an 8:1 compression ratio.<ref name="PCMAG-HURR-2"/><ref name="CWORLD-RD2"/>
 
===CPU utilization overhead===
===Price/performance issues===
In hardware implementations, the technology also relies on price differentials between the various components of the system, for example, the difference between the cost of RAM and the cost of a processor dedicated to compression. The relative price/-to-performance differences of the various components tend to vary over time. For example, the addition of a compression co-processor may have minimal impact on the cost of a CPU.
 
===Prioritization===
In a typical virtual memory implementation, paging happens on a [[least recently used]] basis, potentially causing the compression algorithm to use up CPU cycles dealing with the lowest priority data. Furthermore, program code is usually read-only, and is therefore never paged-out. Instead code is simply discarded, and re-loaded from the program’s auxiliary storage file if needed. In this case the bar for compression is higher, since the I/O cycle it is attempting to eliminate is much shorter, particularly on flash memory devices.
 
==Compression using quantization==
Accelerator designers exploit quantization to reduce the bitwidth
of values and reduce the cost of data movement. However,
any value that does not fit in the reduced bitwidth results in
an overflow (we refer to these values as outliers). Therefore
accelerators use quantization for applications that are tolerant
to overflows. In most applications the rate of
outliers is low and values are often within a narrow range <ref name="Quant"/>
providing the opportunity to exploit quantization in generalpurpose processors. However, a software implementation of
quantization in general-purpose processors has three problems.
First, the programmer has to manually implement conversions
and the additional instructions that quantize and dequantize
values, imposing a programmer's effort and performance overhead. Second, to cover outliers, the bitwidth of the quantized
values often become greater than or equal to the original
values. Third, the programmer has to use standard bitwidth;
otherwise, extracting non-standard bitwidth (i.e., 1–7, 9–15, and
17–31) for representing narrow integers exacerbates the overhead
of software-based quantization. A hardware support in the memory hierarchy of
general-purpose processors for quantization can solve these problems. Tha hardware support allows representing
values by few and flexible numbers of bits and storing outliers
in their original format in a separate space, preventing any
overflow. It minimizes metadata and the overhead of locating
quantized values using a software-hardware interaction that
transfers quantization parameters and data layout to hardware.
As a result, transparent hardware-based quantization has three advantages over cache
compression techniques: (i) less metadata, (ii) higher compression
ratio for floating-point values and cache blocks with multiple data
types, and (iii) lower overhead for locating the compressed blocks.<ref name="Quant"/>
 
==History==
Line 74 ⟶ 50:
===Origins===
 
[[Acorn Computers]]' Unix variant, [[RISC iX]], was supplied as the primary operating system for its R140 workstation released in 1989.<ref name="acornuser198912">{{cite magazine | url=https://archive.org/details/AcornUser089-Dec89/page/n67/mode/2up | magazine=Acorn User | title=Power to the People | last1=Cox | first1=James | date=December 1989 | access-date=6 September 2020 | pages=66-67,69,71}}</ref> RISC iX provided support for demand paging of compressed executable files. However, the principal motivation for providing compressed executablesexecutable files was to be able to accommodate a complete Unix system in a hard disk of relatively modest size. Compressed data was not paged out to disk under this scheme.<ref name="taunton1991">{{cite journalbook | chapter-url=https://archive.org/details/1991-proceedings-tech-conference-nashville/page/385/mode/1up | journaltitle=Proceedings of the Summer 1991 USENIX Conference, Nashville, TN, USA, June 1991 | publisher=USENIX Association | year=1991 | last1=Taunton | first1=Mark | titlechapter=Compressed Executables: An Exercise in Thinking Small | pages=385–404}}</ref><ref name="taunton1991_unix_internals">{{cite newsgroup | title=Compressed executables | date=22 January 1991 | access-date=10 October 2020 | last1=Taunton | first1=Mark | newsgroup=comp.unix.internals | message-id=4743@acorn.co.uk | url=https://groups.google.com/d/msg/comp.unix.internals/mGP6CTNdfDI/4NKaA4_rIxgJ}}</ref>
 
Paul R. Wilson proposed compressed caching of virtual memory pages in 1990, in a paper circulated at the ACM OOPSLA/ECOOP '90 Workshop on Garbage Collection ("Some Issues and Strategies in Heap Management and Memory Hierarchies"), and appearing in ACM SIGPLAN Notices in January 1991.<ref name ="WilsonIssuesStrategies"/>
Line 80 ⟶ 56:
[[Helix Software Company]] pioneered virtual memory compression in 1992, filing a patent application for the process in October of that year.<ref name="PAT-5559978"/> In 1994 and 1995, Helix refined the process using test-compression and secondary memory caches on video cards and other devices.<ref name="PAT-5785474"/> However, Helix did not release a product incorporating virtual memory compression until July 1996 and the release of Hurricane&nbsp;2.0, which used the [[Stac Electronics]] [[Lempel–Ziv–Stac]] compression algorithm and also used off-screen video RAM as a compression buffer to gain performance benefits.<ref name="PCMAG-HURR-2"/>
 
In 1995, RAM cost nearly $50 per [[megabyte]], and [[Microsoft]]'s [[Windows&nbsp; 95]] listed a minimum requirement of 4&nbsp;MB of RAM.<ref name="WIN95-REQ"/> Due to the high RAM requirement, several programs were released which claimed to use compression technology to gain “memory”. Most notorious was the [[SoftRAM]] program from Syncronys Softcorp. SoftRAM was revealedexposed toas befake “placebobecause software” whichit did not includeperform any compression technology at all.<ref name="SoftRAM"/><ref name="Paul_1997_NWDOSTIP"/> Other products, including Hurricane and [[MagnaRAM]], included virtual memory compression, but implemented only [[run-length encoding]], with poor results, giving the technology a negative reputation.<ref name="PCMAG-PERF"/>
 
In its 8 April 1997 issue, PC Magazine published a comprehensive test of the performance enhancement claims of several software virtual memory compression tools. In its testing PC Magazine found a minimal (5% overall) performance improvement from the use of Hurricane, and none at all from any of the other packages.<ref name="PCMAG-PERF"/> However the tests were run on Intel [[Pentium]] systems which had a single core and were single threaded, and thus compression directly impacted all system activity.
Line 87 ⟶ 63:
 
===Recent developments===
* In early 2008, a [[Linux]] project named [[zram]] (originally called compcache) was released; in a 2013 update, it was incorporated into [[Chrome&nbsp;OSChromeOS]]<ref name="zram-google-page"/> and [[Android (operating system)|Android]]&nbsp; 4.4
* In 2010, IBM released Active Memory Expansion (AME) for [[AIX]] 6.1 which implements virtual memory compression.<ref name="IBM-AIX-AME"/>
* In 2012, some versions of the [[POWER7]]+ chip included AME hardware accelerators using the [[842 (compression algorithm)|842 compression algorithm]] for data compression support, used on AIX, for virtual memory compression.<ref name="IBM-POWER7+"/> More recent POWER processors continue to support the feature.
* In December 2012, the [[zswap]] project was announced; it was merged into the [[Linux kernel mainline]] in September 2013.
* In June 2013, Apple announced that it would include virtual memory compression in [[OS&nbsp; X Mavericks]], using the Wilson-Kaplan WKdm algorithm.<ref name="Arstechnica"/><ref name="Willson_Usenix"/>
* A 10 August 2015 "[[Windows Insider]] Preview" update for [[Windows&nbsp;10]] (build 10525) added support for RAM compression.<ref name="Aul_2015"/>
 
Line 103 ⟶ 79:
<ref name="WilsonIssuesStrategies">
{{cite journal |author-last=Wilson |author-first=Paul R. |title=Some Issues and Strategies in Heap Management and Memory Hierarchies |journal=ACM SIGPLAN Notices |date=1991 |volume=26 |issue=3 |pages=45–52 |doi=10.1145/122167.122173|s2cid=15404854 }}</ref>
<ref name="PAT-5559978">{{cite patent |country=US |number=5559978|pubdate=1996-09-24|title=Method for increasing the efficiency of a virtual memory system by selective compression of RAM memory contents|statusassign1=patent[[Helix_Software_Company|Helix Software Co., Inc.]]|inventor1-last=Spilo|inventor1-first=Michael L.}}</ref>
<ref name="PAT-5785474">{{cite patent |country=US |number=5875474|pubdate=1999-02-23|title=Method for caching virtual memory paging and disk input/output requests using off screen video memory|statusassign1=patent[[Helix_Software_Company|Helix Software Co., Inc.]]|inventor1-last=Fabrizio|inventor1-first=Daniel|inventor2-last=Spilo|inventor2-first=Michael L.}}</ref>
<ref name="CaseForCompressedCaching">{{cite conference |url=https://www.usenix.org/legacy/event/usenix99/full_papers/wilson/wilson.pdf |title=The Case for Compressed Caching in Virtual Memory Systems |author-last1=Wilson |author-first1=Paul R. |author-last2=Kaplan |author-first2=Scott F. |author-last3=Smaragdakis |author-first3=Yannis |date= 1999-06-06 |conference=USENIX Annual Technical Conference |___location=Monterey, California, USA |pages=101–116}}</ref>
<ref name="SIMPSON">{{cite web |author-last=Simpson |author-first=Matthew |title=Analysis of Compression Algorithms for Program Data |date=2014 |url=http://www.ece.umd.edu/~barua/matt-compress-tr.pdf |access-date=2015-01-09 |pages=64-14}}</ref>
<ref name="RIZZO">{{cite journal |author-last=Rizzo |author-first=Luigi |title=A very fast algorithm for RAM compression |journal=ACM SIGOPS Operating Systems Review |date=1996 |volume=31 |issue=2 |url=http://dl.acm.org/citation.cfm?id=250012 |access-date=2015-01-09 |page=8|doi=10.1145/250007.250012 |s2cid=18563587 |url-access=subscription }}</ref>
<ref name="DENNING">{{cite journal |author-last=Denning |author-first=Peter J. |title=Thrashing: Its causes and prevention |journal=Proceedings AFIPS, Fall Joint Computer Conference |date=1968 |url=http://www.cs.uwaterloo.ca/~brecht/courses/702/Possible-Readings/vm-and-gc/thrashing-denning-afips-1968.pdf |access-date=2015-01-05 |page=918 |volume=33}}</ref>
<ref name="FREEDMAN">{{cite web |author-last=Freedman |author-first=Michael J. |title=The Compression Cache: Virtual Memory Compression for Handheld Computers |url=http://www.cs.princeton.edu/~mfreed//docs/6.033/compression.pdf |date=2000-03-16 |access-date=2015-01-09}}</ref>
<ref name="CWORLD-RD2">{{cite bookjournal |url=https://books.google.com/books?id=BUaIcc6lsdwC&pg=PA56 |title=Mac Memory Booster Gets an Upgrade |journal=[[Computerworld]] |publisher=ComputerWorldIDG MagazineEnterprise |date=9 September 1996 |issn=0010-09-094841 |volume=30 |number =37 |page=56 |access-date=2015-01-12}}</ref>
<ref name="PCMAG-HURR-2">{{cite journal |url=https://books.google.com/books?id=7WGv1D0tOVYC&pg=PA48 |title=Hurricane 2.0 Squeezes the Most Memory from Your System |journal=[[PC Magazine]] |date=1996-10-08 |access-date=2015-01-01}}</ref>
<ref name="Quant">{{cite journal|author-last=Lenjani |author-first=Marzieh | url=http://www.cs.virginia.edu/~es9bt/papers/iiswc2019_Quantized.pdf |title=An Overflow-free Quantized Memory Hierarchy in General-purpose Processors |journal=[[In Proceedings of the IEEE International Symposium on Workload Characterization]] |date=2019-11-03 |access-date=2020-03-16}}</ref>
<ref name="PCMAG-PERF">{{cite journal |url=https://books.google.com/books?id=8RSHdk84u50C&pg=RA1-PA165 |title=Performance Enhancers |journal=[[PC Magazine]] |date=1997-04-08 |access-date=2015-01-01}}</ref>
<ref name="SoftRAM">{{cite journal |url=https://books.google.com/books?id=XcEKP0ml18EC&pg=PA34 |title=SoftRAM Under Scruitny |journal=[[PC Magazine]] |date=1996-01-23 |access-date=2015-01-01}}</ref>
<ref name="IBM-MXT-PERF">{{cite web |url=http://www.kkant.net/papers/caecw.doc |title=An Evaluation of Memory Compression Alternatives |author-first=Krishna |author-last=Kant |publisher=[[Intel Corporation]] |date=2003-02-01 |access-date=2015-01-01}}</ref>
<ref name="IBM-MXT-NEWS">{{cite web |url=http://www-03.ibm.com/press/us/en/pressrelease/1653.wss |archive-url=https://web.archive.org/web/20130622050529/http://www-03.ibm.com/press/us/en/pressrelease/1653.wss |url-status=dead |archive-date=22 June 2013 |title=IBM Research Breakthrough Doubles Computer Memory Capacity |publisher=[[IBM]] |date=2000-06-26 |access-date=2015-01-01}}</ref>
<ref name="IBM-MXT-PAPERS">{{cite web |url=http://researcher.watson.ibm.com/researcher/view_group_pubs.php?grp=2917 |title=Memory eXpansion Technologies |publisher=[[IBM]] |access-date=2015-01-01}}</ref>
<ref name="zswap-bench">{{cite web |url = https://events.linuxfoundation.org/sites/events/files/slides/tmc_sjennings_linuxcon2013.pdf |title=Transparent Memory Compression in Linux |author-first=Seth |author-last=Jennings |website=linuxfoundation.org |access-date=2015-01-01 |archive-date=2015-01-04 |archive-url=https://web.archive.org/web/20150104214723/https://events.linuxfoundation.org/sites/events/files/slides/tmc_sjennings_linuxcon2013.pdf |url-status=dead }}</ref>
<ref name="zram-google-page">{{cite web |url=https://code.google.com/p/compcache/ |title=CompCache |publisher=Google code |access-date=2015-01-01}}</ref>
<ref name="IBM-AIX-AME">{{cite web |url=https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101633 |title=AIX 6.1 Active Memory Expansion |publisher=[[IBM]] |access-date=2015-01-01 |archive-date=2015-01-04 |archive-url=https://web.archive.org/web/20150104210445/https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101633 |url-status=dead }}</ref>
<ref name="IBM-POWER7+">{{cite web |url=http://www-05.ibm.com/cz/events/febannouncement2012/pdf/power_architecture.pdf |title=IBM Power Systems Hardware Deep Dive |publisher=[[IBM]] |access-date=2015-01-01 |archive-date=2015-01-04 |archive-url=https://web.archive.org/web/20150104205645/http://www-05.ibm.com/cz/events/febannouncement2012/pdf/power_architecture.pdf |url-status=dead }}</ref>
<ref name="ZRAM-BENCH">{{cite web |url=https://code.google.com/p/compcache/wiki/Performance |title=Performance numbers for compcache |access-date=2015-01-01}}</ref>
<ref name="WIN95-REQ">{{cite web |url=http://support.microsoft.com/kb/138349/en-us |title=Windows 95 Installation Requirements |publisher=[[Microsoft]] |access-date=2015-01-01}}</ref>
<ref name="Arstechnica">{{Cite web|url=https://arstechnica.com/apple/2013/10/os-x-10-9/17/#compressed-memory|title=OS X 10.9 Mavericks: The Ars Technica Review|date=22 October 2013}}</ref>
<ref name="Willson_Usenix">{{Cite web|url=https://www.usenix.org/legacy/publications/library/proceedings/usenix01/cfp/wilson/wilson_html/acc.html|title = The Case for Compressed Caching in Virtual Memory Systems}}</ref>
<ref name="Aul_2015">{{cite web |author-last=Aul |author-first=Gabe |url=httphttps://blogs.windows.com/bloggingwindowswindows-insider/2015/08/18/announcing-windows-10-insider-preview-build-10525/ |title=Announcing Windows 10 Insider Preview Build 10525 |work=Blogging Windows Insider Blog |publisher=[[Microsoft]] |date=2015-08-18 |access-date=20152024-08-1903}}</ref>
<ref name="Paul_1997_NWDOSTIP">{{cite book |title=NWDOS-TIPs &mdash; Tips & Tricks rund um Novell DOS 7, mit Blick auf undokumentierte Details, Bugs und Workarounds |chapter=Kapitel II.18. Mit STACKER Hauptspeicher 'virtuell' verdoppeln… |language=de |trans-title=NWDOS-TIPs &mdash; Tips & tricks for Novell DOS 7, with a focus on undocumented details, bugs and workarounds |trans-chapter=Utilizing [[Stacker (disk compression)|STACKER]] to 'virtually' double main memory… |work=MPDOSTIP |author-first=Matthias R. |author-last=Paul |date=1997-07-30 |orig-year=1996-04-14 |edition=3 |version=Release 157 |url=http://www.antonis.de/dos/dos-tuts/mpdostip/html/nwdostip.htm |access-date=2012-01-11 |url-status=live |archive-url=https://web.archive.org/web/20161105172944/http://www.antonis.de/dos/dos-tuts/mpdostip/html/nwdostip.htm |archive-date=2016-11-05}}</ref>
}}
 
Line 133 ⟶ 108:
{{Operating system}}
 
{{DEFAULTSORT[[Category:Virtual Memory}}memory]]
[[Category:Virtual memory|virtual memory compression]]