Revision as of 02:31, 12 December 2023 by Maxeto0910 (m, →Stream buffers); revision as of 22:28, 19 December 2023 by David Eppstein (Hyesoon Kim)
* Whenever the prefetch mechanism detects a miss on a memory block, say A, it allocates a stream to begin prefetching successive blocks from the missed block onward. If the stream buffer can hold 4 blocks, then blocks A+1, A+2, A+3 and A+4 are prefetched and held in the allocated stream buffer. If the processor consumes A+1 next, that block is moved "up" from the stream buffer to the processor's cache. The first entry of the stream buffer is then A+2, and so on. This pattern of prefetching successive blocks is called '''sequential prefetching'''. It is mainly used when contiguous locations are to be prefetched, for example when prefetching instructions.
* This mechanism can be scaled up by adding multiple such stream buffers, each of which maintains a separate prefetch stream.<ref>{{Cite conference |last1=Ishii |first1=Yasuo |last2=Inaba |first2=Mary |last3=Hiraki |first3=Kei |date=2009-06-08 |title=Access map pattern matching for data cache prefetch |url=https://doi.org/10.1145/1542275.1542349 |conference=ICS 2009 |location=New York, New York, USA |publisher=Association for Computing Machinery |pages=499–500 |doi=10.1145/1542275.1542349 |isbn=978-1-60558-498-0 |book-title=Proceedings of the 23rd International Conference on Supercomputing |s2cid=37841036}}</ref> For each new miss, a new stream buffer is allocated, and it operates in the same way as described above.
* The ideal depth of the stream buffer is subject to experimentation against various benchmarks<ref name=":1" /> and depends on the rest of the [[microarchitecture]] involved.<ref>{{Cite conference |last1=Srinath |first1=Santhosh |last2=Mutlu |first2=Onur |last3=Kim |first3=Hyesoon |author3-link=Hyesoon Kim |last4=Patt |first4=Yale N. |author4-link=Yale Patt |date=February 2007 |title=Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers |url=https://ieeexplore.ieee.org/document/4147648 |conference=2007 IEEE 13th International Symposium on High Performance Computer Architecture |pages=63–74 |doi=10.1109/HPCA.2007.346185 |isbn=978-1-4244-0804-7 |s2cid=6909725}}</ref>

=== Strided prefetching ===
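The sequential stream buffer described in the stream-buffer list above can be sketched as a small software model. This is only an illustration under stated assumptions, not a description of any particular hardware: the names <code>StreamBuffer</code> and <code>simulate</code> are hypothetical, only a single buffer is modelled, the cache is treated as unbounded, and the buffer is assumed to fetch one further block each time its head entry is consumed (the text above does not specify the refill policy).

```python
from collections import deque


class StreamBuffer:
    """Toy model of one sequential stream buffer (a FIFO of block numbers)."""

    def __init__(self, depth=4):
        self.depth = depth      # how many blocks the buffer can hold
        self.blocks = deque()   # prefetched block numbers, head first

    def allocate(self, missed_block):
        """On a miss to block A, prefetch A+1 .. A+depth into the buffer."""
        self.blocks = deque(missed_block + i for i in range(1, self.depth + 1))

    def access(self, block):
        """Return True if `block` is the head entry of the buffer.

        On a hit the head entry is handed to the cache (popped here) and,
        as an assumption of this sketch, the next sequential block is
        prefetched so the buffer stays full.
        """
        if self.blocks and self.blocks[0] == block:
            self.blocks.popleft()
            next_block = self.blocks[-1] + 1 if self.blocks else block + 1
            self.blocks.append(next_block)
            return True
        return False


def simulate(accesses, depth=4):
    """Return (cache_misses, stream_buffer_hits) for a block access trace."""
    buf = StreamBuffer(depth)
    cache = set()               # assumption: unbounded cache, no evictions
    misses = hits = 0
    for block in accesses:
        if block in cache:
            continue            # already cached, no prefetcher involvement
        if buf.access(block):
            hits += 1           # block moved "up" from the stream buffer
        else:
            misses += 1         # true miss: (re)allocate the stream
            buf.allocate(block)
        cache.add(block)
    return misses, hits
```

With a purely sequential trace such as <code>simulate([10, 11, 12, 13, 14, 15])</code>, only the first access misses and the remaining five are served from the stream buffer. A trace that jumps between regions, such as <code>[10, 20, 11]</code>, keeps reallocating the single buffer, which is the situation that multiple stream buffers are meant to address.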

Cache prefetching: Difference between revisions