* Stream buffers were developed based on the "one block lookahead (OBL) scheme" proposed by [[Alan Jay Smith]].<ref name=":3" />
* Stream [[Data buffer|buffers]] are one of the most common hardware-based prefetching techniques in use.<ref>{{Cite journal|last=Mittal|first=Sparsh|date=2016-08-01|title=A Survey of Recent Prefetching Techniques for Processor Caches|journal=ACM Comput. Surv.|volume=49|issue=2|pages=35:1–35:35|doi=10.1145/2907071|issn=0360-0300|url=https://zenodo.org/record/1236174}}</ref> This technique was originally proposed by [[Norman Jouppi]] in 1990<ref name=":1">{{cite conference|last=Jouppi|first=Norman P.|title=Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers|publisher=ACM Press|publication-place=New York, New York, USA|year=1990|isbn=0-89791-366-3|doi=10.1145/325164.325162|citeseerx=10.1.1.37.6114}}</ref> and many variations of this method have been developed since.<ref>{{Cite journal|last=Chen|first=Tien-Fu|last2=Baer|first2=Jean-Loup|date=1995-05-01|title=Effective hardware-based data prefetching for high-performance processors|journal=IEEE Transactions on Computers|volume=44|issue=5|pages=609–623|doi=10.1109/12.381947|issn=0018-9340|url=https://semanticscholar.org/paper/bc2bba7e1bb4e7d8307aa36bdc5ee86cdd61cc58}}</ref><ref>{{Cite conference|last=Palacharla|first=S.|last2=Kessler|first2=R. E.|date=1994-01-01|title=Evaluating Stream Buffers As a Secondary Cache Replacement|conference=21st Annual International Symposium on Computer Architecture|___location=Chicago, IL, USA|publisher=IEEE Computer Society Press|pages=24–33|doi=10.1145/191995.192014|isbn=978-0818655104|doi-broken-date=2019-02-19|citeseerx=10.1.1.92.3031}}</ref><ref>{{cite journal|last=Grannaes|first=Marius|last2=Jahre|first2=Magnus|last3=Natvig|first3=Lasse|title=Storage Efficient Hardware Prefetching using Delta-Correlating Prediction Tables|citeseerx=10.1.1.229.3483|journal=Journal of Instruction-Level Parallelism|issue=13|year=2011|pages=1–16}}</ref> The basic idea is that the [[cache miss]] address (and <math>k</math> subsequent addresses) is fetched into a separate buffer of depth <math>k</math>. This buffer is called a stream buffer and is separate from the cache. The processor then consumes data/instructions from the stream buffer if the address associated with the prefetched blocks matches the requested address generated by the program executing on the processor. The figure below illustrates this setup:
[[File:CachePrefetching_StreamBuffers.png|center|A typical stream buffer setup as originally proposed by Norman Jouppi in 1990<ref name=":1" />|alt=A typical stream buffer setup as originally proposed|thumb|400x400px]]
* Whenever the prefetch mechanism detects a miss on a memory block, say A, it allocates a stream to begin prefetching successive blocks from the missed block onward. If the stream buffer can hold 4 blocks, then we would prefetch A+1, A+2, A+3, A+4 and hold those in the allocated stream buffer. If the processor consumes A+1 next, then it is moved "up" from the stream buffer to the processor's cache. The first entry of the stream buffer would now be A+2, and so on. This pattern of prefetching successive blocks is called '''sequential prefetching'''. It is mainly used when contiguous locations are to be prefetched, for example when prefetching instructions.
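The allocate-on-miss and move-up behavior described above can be sketched as a toy simulation. This is a simplified illustration (the class name, integer block addresses, and single-stream restriction are assumptions for clarity, not details of Jouppi's hardware design):

```python
from collections import deque

class StreamBuffer:
    """Toy model of one stream buffer of depth k (illustrative sketch only)."""

    def __init__(self, depth=4):
        self.depth = depth
        self.buffer = deque()      # FIFO of prefetched block addresses
        self.next_prefetch = None  # next successive block address to fetch

    def allocate(self, miss_addr):
        # On a cache miss at block A, prefetch A+1 .. A+depth into the buffer.
        self.buffer = deque(miss_addr + i for i in range(1, self.depth + 1))
        self.next_prefetch = miss_addr + self.depth + 1

    def access(self, addr):
        # A request hits only if it matches the head entry (sequential stream).
        if self.buffer and self.buffer[0] == addr:
            self.buffer.popleft()                   # block moves "up" to the cache
            self.buffer.append(self.next_prefetch)  # refill the freed slot
            self.next_prefetch += 1
            return True
        return False

# A miss on block 100 allocates the stream 101..104.
sb = StreamBuffer(depth=4)
sb.allocate(100)
assert sb.access(101)                          # head hit: 101 moves to the cache
assert list(sb.buffer) == [102, 103, 104, 105]  # buffer shifted up, 105 prefetched
assert not sb.access(104)                      # non-head access does not hit
```

Note that a real implementation holds several such buffers and checks the requested address against all of their head entries in parallel.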