Cache (computing): Difference between revisions

 
In [[computing]], a '''cache''' ({{IPAc-en|audio=LL-Q1860 (eng)-Back ache-cache.wav|k|æ|ʃ}} {{respell|KASH}})<ref>
{{cite web|url=http://www.oxforddictionaries.com/definition/english/cache|archive-url=https://web.archive.org/web/20120818122040/http://oxforddictionaries.com/definition/english/cache|url-status=dead|archive-date=18 August 2012|title=Cache|work=Oxford Dictionaries|access-date=2 August 2016}}</ref> is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A '''cache hit''' occurs when the requested data can be found in a cache, while a '''cache miss''' occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests that can be served from the cache, the faster the system performs.<ref>{{Cite journal|last1=Zhong|first1=Liang|last2=Zheng|first2=Xueqian|last3=Liu|first3=Yong|last4=Wang|first4=Mengting|last5=Cao|first5=Yang|date=February 2020|title=Cache hit ratio maximization in device-to-device communications overlaying cellular networks|url=http://dx.doi.org/10.23919/jcc.2020.02.018|journal=China Communications|volume=17|issue=2|pages=232–238|doi=10.23919/jcc.2020.02.018|s2cid=212649328|issn=1673-5447|url-access=subscription}}</ref>
 
To be cost-effective, caches must be relatively small. Nevertheless, caches are effective in many areas of computing because typical [[Application software|computer applications]] access data with a high degree of [[locality of reference]]. Such access patterns exhibit temporal locality, where data is requested that has been recently requested, and spatial locality, where data is requested that is stored near data that has already been requested.
During a cache miss, some other previously existing cache entry is typically removed in order to make room for the newly retrieved data. The [[Heuristic (computer science)|heuristic]] used to select the entry to replace is known as the [[Cache replacement policies|replacement policy]]. One popular replacement policy, least recently used (LRU), replaces the entry that was accessed least recently. More sophisticated caching algorithms also take into account the frequency of use of entries.
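The LRU policy can be sketched in a few lines of Python (a minimal software illustration, not a hardware implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the entry that was accessed least recently."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # ordering tracks recency of access

    def get(self, key):
        if key not in self.entries:
            return None                    # cache miss
        self.entries.move_to_end(key)      # mark as most recently used
        return self.entries[key]           # cache hit

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

Accessing an entry moves it to the most-recently-used end, so the entry at the other end is always the eviction candidate.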
 
==={{Anchor|Dirty|WRITEPOLICIES|WRITE-BACK|WRITE-BEHIND|WRITE-THROUGH|WRITE-AROUND}}Write policies===
[[File:Write-through with no-write-allocation.svg|thumb|380px|A write-through cache without write allocation]]
[[File:Write-back with write-allocation.svg|thumb|500px|A write-back cache with write allocation]]
 
Cache writes must eventually be propagated to the backing store. The timing for this write is governed by what is known as the ''write policy''. The two primary write policies are:<ref>{{Cite web|url=https://www.linuxjournal.com/article/7105|title=Understanding Caching|last=Bottomley|first=James|date=2004-01-01|website=Linux Journal|access-date=2019-10-01}}</ref>
* ''Write-through'': Writes are performed synchronously to both the cache and the backing store.
* ''Write-back'': Initially, writing is done only to the cache. The write to the backing store is postponed until the modified content is about to be replaced by another cache block.
 
A write-back cache is more complex to implement since it needs to track which of its locations have been written over and mark them as ''dirty'' for later writing to the backing store. The data in these locations are written back to the backing store only when they are evicted from the cache, a process referred to as a ''lazy write''. For this reason, a read miss in a write-back cache may require two accesses to the backing store: one to write back the dirty data, and one to retrieve the requested data. Other policies may also trigger data write-back. The client may make many changes to data in the cache, and then explicitly notify the cache to write back the data.
 
Write operations do not return data. Consequently, a decision needs to be made for write misses: whether or not to load the data into the cache. This is determined by these ''write-miss policies'':
* ''Write allocate'' (also called ''fetch on write''): Data at the missed-write ___location is loaded to cache, followed by a write-hit operation. In this approach, write misses are similar to read misses.
* ''No-write allocate'' (also called ''write-no-allocate'' or ''write around''): Data at the missed-write ___location is not loaded to cache, and is written directly to the backing store. In this approach, data is loaded into the cache on read misses only.
 
While both write-through and write-back policies can implement either write-miss policy, they are typically paired as follows:<ref name="HennessyPatterson2011">{{cite book|last1=Hennessy|first1=John L.|url=https://books.google.com/books?id=v3-1hVwHnHwC&pg=SL2-PA12|title=Computer Architecture: A Quantitative Approach|last2=Patterson|first2=David A.|publisher=Elsevier|year=2011|isbn=978-0-12-383872-8|page=B–12|language=en}}</ref><ref>{{cite book|title=Computer Architecture A Quantitative Approach|last1=Patterson|first1=David A.|last2=Hennessy|first2=John L.|isbn=1-55860-069-8|date=1990|page=413|publisher=Morgan Kaufmann Publishers}}</ref>
* A write-back cache typically employs write allocate, anticipating that subsequent writes (or even reads) to the same ___location will benefit from having the data already in the cache.
* A write-through cache uses no-write allocate. Here, subsequent writes have no advantage, since they still need to be written directly to the backing store.
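The write-back/write-allocate pairing can be sketched as follows (a minimal Python illustration; a plain dict stands in for the slower backing store, and the eviction choice is deliberately simplistic):

```python
class WriteBackCache:
    """Sketch of a write-back cache with write allocate and dirty tracking."""

    def __init__(self, backing_store, capacity):
        self.backing = backing_store
        self.capacity = capacity
        self.lines = {}     # cached key -> value
        self.dirty = set()  # keys modified since last write-back

    def read(self, key):
        if key not in self.lines:                # read miss
            self._allocate(key, self.backing.get(key))
        return self.lines[key]

    def write(self, key, value):
        if key not in self.lines:                # write miss: write allocate
            self._allocate(key, self.backing.get(key))
        self.lines[key] = value                  # write only to the cache
        self.dirty.add(key)                      # backing-store write deferred

    def _allocate(self, key, value):
        if len(self.lines) >= self.capacity:
            victim = next(iter(self.lines))      # simplistic eviction choice
            if victim in self.dirty:             # lazy write of dirty data
                self.backing[victim] = self.lines[victim]
                self.dirty.discard(victim)
            del self.lines[victim]
        self.lines[key] = value
```

Note that the backing store is only updated when a dirty line is evicted, which is exactly the lazy-write behaviour described above.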
 
Entities other than the cache may change the data in the backing store, in which case the copy in the cache may become out-of-date or ''stale''. Alternatively, when the client updates the data in the cache, copies of that data in other caches will become stale. Communication protocols between the cache managers that keep the data consistent are associated with [[cache coherence]].
 
===Prefetch===
Earlier [[graphics processing unit]]s (GPUs) often had limited read-only [[texture cache]]s and used [[swizzling (computer graphics)|swizzling]] to improve 2D [[locality of reference]]. [[Cache miss]]es would drastically affect performance, e.g. if [[mipmapping]] was not used. Caching was important to leverage 32-bit (and wider) transfers for texture data that was often as little as 4 bits per pixel.
 
As GPUs advanced, supporting [[general-purpose computing on graphics processing units]] and [[compute kernel]]s, they have developed progressively larger and increasingly general caches, including [[instruction cache]]s for [[shader]]s, exhibiting functionality commonly found in CPU caches. These caches have grown to handle [[synchronization primitive]]s between threads and [[atomic operation]]s, and interface with a CPU-style MMU.
 
===DSPs===
The time aware least recently used (TLRU) is a variant of LRU designed for the situation where the stored contents in cache have a valid lifetime. The algorithm is suitable in network cache applications, such as ICN, [[content delivery network]]s (CDNs) and distributed networks in general. TLRU introduces a new term: time to use (TTU). TTU is a time stamp on content which stipulates the usability time for the content based on the locality of the content and information from the content publisher. Owing to this locality-based time stamp, TTU provides more control to the local administrator to regulate in-network storage.
 
In the TLRU algorithm, when a piece of content arrives, a cache node calculates the local TTU value based on the TTU value assigned by the content publisher. The local TTU value is calculated by using a locally defined function. Once the local TTU value is calculated, the replacement of content is performed on a subset of the total content stored in the cache node. TLRU ensures that less popular and short-lived content is replaced with incoming content.<ref>{{cite conference|last1=Bilal|first1=Muhammad|first2=Shin-Gak|last2=Kang|conference=16th International Conference on Advanced Communication Technology|title=Time Aware Least Recent Used (TLRU) cache management policy in ICN|year=2014|pages=528–532|doi=10.1109/ICACT.2014.6779016|arxiv=1801.00390|bibcode=2018arXiv180100390B|isbn=978-89-968650-3-2|s2cid=830503}}</ref>
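A minimal sketch of TLRU insertion in Python, assuming an illustrative locally defined TTU function (a simple 0.5 scaling factor; a real deployment would derive it from content locality and publisher information):

```python
import time

def tlru_insert(cache, key, value, publisher_ttu, capacity, now=None):
    """Insert content into a TLRU cache. Each entry carries an expiry time
    derived from the publisher's TTU via a locally defined function (here a
    hypothetical 0.5 scaling factor). On overflow, already-expired entries
    are evicted first; otherwise the least recently used entry is evicted."""
    now = time.monotonic() if now is None else now
    expiry = now + 0.5 * publisher_ttu      # hypothetical local TTU function
    if len(cache) >= capacity:
        expired = [k for k, (_, exp, _) in cache.items() if exp <= now]
        victim = expired[0] if expired else min(cache, key=lambda k: cache[k][2])
        del cache[victim]
    cache[key] = (value, expiry, now)       # (content, expiry, last access)
```

Preferring expired content as the eviction victim is what distinguishes this sketch from plain LRU.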
 
=====Least frequent recently used=====
The least frequent recently used (LFRU) cache replacement scheme combines the benefits of LFU and LRU schemes. LFRU is suitable for network cache applications, such as ICN, CDNs and distributed networks in general. In LFRU, the cache is divided into two partitions called privileged and unprivileged partitions. The privileged partition can be seen as a protected partition. If content is highly popular, it is pushed into the privileged partition. Replacement of the privileged partition is done by first evicting content from the unprivileged partition, then pushing content from the privileged partition to the unprivileged partition, and finally inserting new content into the privileged partition. In the above procedure, the LRU is used for the privileged partition and an approximated LFU (ALFU) scheme is used for the unprivileged partition. The basic idea is to cache the locally popular content with the ALFU scheme and push the popular content to the privileged partition.<ref>{{cite journal|author=Bilal, Muhammad|display-authors=etal|title=A Cache Management Scheme for Efficient Content Eviction and Replication in Cache Networks|journal=IEEE Access|volume=5|pages=1692–1701|arxiv=1702.04078|bibcode=2017arXiv170204078B|year=2017|doi=10.1109/ACCESS.2017.2669344|s2cid=14517299}}</ref><!--[[User:Kvng/RTH]]-->
 
====Weather forecast====
In 2011, the use of smartphones with weather forecasting options was overly taxing [[AccuWeather]] servers; two requests from the same area would generate separate requests. An optimization by edge-servers to truncate the GPS coordinates to fewer decimal places meant that the cached results from a nearby query would be used. The number of to-the-server lookups per day dropped by half.<ref>{{cite magazine|author=Murphy|first=Chris|date=May 30, 2011|title=5 Lines Of Code In The Cloud|magazine=[[InformationWeek]]|page=28|quote=300 million to 500 million fewer requests a day handled by AccuWeather servers}}</ref>
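The coordinate-truncation idea can be sketched as follows (the two-decimal precision is illustrative; the source does not specify the exact precision used):

```python
def weather_cache_key(lat, lon, precision=2):
    """Round coordinates so nearby requests map to one cache entry.
    Two decimal places (roughly a 1 km grid) is illustrative; the actual
    precision used by AccuWeather's edge servers is not specified here."""
    return (round(lat, precision), round(lon, precision))
```

Requests from points a few hundred metres apart then produce identical cache keys, so only the first one reaches the origin server.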
 
==Software caches==
===Disk cache===
{{Main|Page cache}}
While CPU caches are generally managed entirely by hardware, a variety of software manages other caches. The [[page cache]] in main memory, which is an example of disk cache, is managed by the operating system [[Kernel (operating system)|kernel]].
 
While the [[disk buffer]], which is an integrated part of the hard disk drive or solid-state drive, is sometimes misleadingly referred to as ''disk cache'', its main functions are write sequencing and read prefetching. Repeated cache hits are relatively rare, due to the small size of the buffer in comparison to the drive's capacity. However, high-end [[disk controller]]s often have their own on-board cache for the hard disk drive's data blocks.
 
Finally, a fast local hard disk drive can also cache information held on even slower data storage devices, such as remote servers ([[web cache]]) or local [[tape drive]]s or [[optical jukebox]]es; such a scheme is the main concept of [[hierarchical storage management]]. Also, fast flash-based solid-state drives (SSDs) can be used as caches for slower rotational-media hard disk drives, working together as [[hybrid drive]]s or [[solid-state hybrid drive]]s (SSHDs).
 
===Web cache===
{{Main|Web cache}}
Web browsers and [[Proxy server|web proxy server]]s, either locally or at the [[Internet service provider]] (ISP), employ web caches to store previous responses from web servers, such as [[web page]]s and [[image file format|image]]s. Web caches reduce the amount of information that needs to be transmitted across the network, as information previously stored in the cache can often be re-used. This reduces bandwidth and processing requirements of the web server, and helps to improve [[responsiveness]] for users of the web.<ref>{{cite web|url=http://docforge.com/wiki/Web_application/Caching|title=Web application caching|author=Multiple (wiki)|work=Docforge|access-date=2013-07-24|archive-date=12 December 2019|archive-url=https://web.archive.org/web/20191212152625/http://www.docforge.com/wiki/Web_application/Caching|url-status=dead}}</ref>
 
Web browsers employ a built-in web cache, but some [[Internet service provider]]s (ISPs) or organizations also use a caching proxy server, which is a web cache that is shared among all users of that network.
 
Another form of cache is [[P2P caching]], where the files most sought for by [[peer-to-peer]] applications are stored in an ISP cache to accelerate P2P transfers. Similarly, decentralised equivalents exist, which allow communities to perform the same task for P2P traffic, for example, Corelli.<ref>{{cite conference|last1=Tyson|first1=Gareth|last2=Mauthe|first2=Andreas|last3=Kaune|first3=Sebastian|last4=Mu|first4=Mu|last5=Plagemann|first5=Thomas|title=Corelli: A Dynamic Replication Service for Supporting Latency-Dependent Content in Community Networks|url=http://comp.eprints.lancs.ac.uk/2044/1/MMCN09.pdf|conference=MMCN'09|archive-url=https://web.archive.org/web/20150618193018/http://comp.eprints.lancs.ac.uk/2044/1/MMCN09.pdf|archive-date=2015-06-18}}</ref>
 
===Content delivery network===
{{Main|Content delivery network}}
 
A [[content delivery network]] (CDN) is a network of distributed servers that deliver pages and other [[web content]] to a user, based on the geographic locations of the user, the origin of the web page and the content delivery server.
 
CDNs were introduced in the late 1990s as a way to speed up the delivery of static content, such as HTML pages, images and videos. By replicating content on multiple servers around the world and delivering it to users based on their ___location, CDNs can significantly improve the speed and availability of a website or application. When a user requests a piece of content, the CDN will check to see if it has a copy of the content in its cache. If it does, the CDN will deliver the content to the user from the cache.<ref name=":0">{{cite web|url=https://people.cs.umass.edu/~ramesh/Site/PUBLICATIONS_files/DMPPSW02.pdf|title=Globally Distributed Content Delivery, by J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman and B. Weihl, IEEE Internet Computing, Volume 6, Issue 5, November 2002.|access-date=2019-10-25|archive-url=https://web.archive.org/web/20170809231307/http://people.cs.umass.edu/~ramesh/Site/PUBLICATIONS_files/DMPPSW02.pdf|archive-date=2017-08-09|url-status=live}}</ref>
 
===Cloud storage gateway===
{{Main|Cloud storage gateway}}
A [[cloud storage gateway]], also known as an edge filer, is a [[hybrid cloud storage]] device that connects a local network to one or more [[cloud storage service]]s, typically [[object storage]] services such as [[Amazon S3]]. It provides a cache for frequently accessed data, giving high-speed local access to data held in the cloud storage service. Cloud storage gateways also provide additional benefits such as accessing cloud object storage through traditional file serving protocols as well as continued access to cached data during connectivity outages.<ref name="searchstorage1">{{cite web|url=https://www.techtarget.com/searchstorage/definition/cloud-storage-gateway|title=Definition: cloud storage gateway|work=SearchStorage|date=July 2014}}</ref>
 
===Other caches===
The [[BIND]] DNS daemon caches a mapping of ___domain names to [[IP address]]es, as does a [[DNS resolver]] library.
 
Write-through operation is common when operating over unreliable networks, because of the enormous complexity of the coherency protocol required between multiple write-back caches when communication is unreliable. For instance, web page caches and [[client-side]] caches for [[distributed file system]]s (like those in [[Network File System (protocol)|NFS]] or [[Server Message Block|SMB]]) are typically read-only or write-through specifically to keep the network protocol simple and reliable.
 
[[Web search engine|Search engine]]s also frequently make web pages they have indexed available from their [[Search engine cache|cache]]. This can prove useful when web pages from a web server are temporarily or permanently inaccessible.
 
[[Database caching]] can substantially improve the throughput of [[database]] applications, for example in the processing of [[Database index|indexes]], [[Data dictionary|data dictionaries]], and frequently used subsets of data.
 
A [[distributed cache]]<ref>{{cite journal|last1=Paul|first1=S.|last2=Fei|first2=Z.|date=1 February 2001|title=Distributed caching with centralized control|journal=Computer Communications|volume=24|issue=2|pages=256–268|citeseerx=10.1.1.38.1094|doi=10.1016/S0140-3664(00)00322-4}}<!--|access-date=18 November 2009--></ref> uses networked hosts to provide scalability, reliability and performance to the application.<ref>{{cite journal|last=Khan|first=Iqbal|title=Distributed Caching on the Path To Scalability|url=https://msdn.microsoft.com/magazine/dd942840.aspx|journal=MSDN|volume=24|issue=7|date=July 2009}}</ref> The hosts can be co-located or spread over different geographical regions.<!--[[User:Kvng/RTH]]-->
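Distributing cache keys across networked hosts can be sketched by hashing (hostnames are hypothetical; production systems often use consistent hashing instead, so that few keys are remapped when hosts join or leave):

```python
import hashlib

def pick_host(key, hosts):
    """Map a cache key to one of several networked cache hosts by hashing.
    SHA-256 spreads keys uniformly, so load is balanced across hosts and
    every client resolves the same key to the same host."""
    digest = hashlib.sha256(key.encode()).digest()
    return hosts[int.from_bytes(digest[:8], 'big') % len(hosts)]
```

Because the mapping is deterministic, any client in the cluster can locate the host holding a given key without central coordination.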
 
=={{anchor|The difference between buffer and cache}}<!--Former section name used in external links-->Buffer vs. cache==