Memory access pattern

=== 2D Spatially coherent ===
In [[3D rendering]], access patterns for [[texture mapping]] and [[rasterization]] of small primitives (with arbitrary distortions of complex surfaces) are far from linear, but can still exhibit spatial locality (e.g. in [[screen space]] or [[texture space]]). This can be turned into good ''memory'' locality via some combination of [[Morton order]]
<ref>{{cite web|title=The Design and Analysis of a Cache Architecture for Texture Mapping|url=http://www.cs.cmu.edu/afs/cs/academic/class/15869-f11/www/readings/hakura97_texcaching.pdf}}see Morton order, texture access pattern</ref>
and [[Tiling (computer graphics)|tiling]] for [[texture map]]s and [[frame buffer]] data (mapping spatial regions onto cache lines), or by sorting primitives via [[tile based deferred rendering]].<ref>{{cite web|title=morton order to accelerate texturing|url=http://john.cs.olemiss.edu/~rhodes/papers/Nocentino10.pdf}}</ref> It can also be advantageous to store matrices in Morton order in [[linear algebra libraries]].
<ref>{{cite web|title=Morton-order Matrices Deserve Compilers’ Support Technical Report 533|url=http://www.cs.indiana.edu/pub/techreports/TR533.pdf}}discusses the importance of morton order for matrices</ref>
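As an illustrative sketch (not drawn from the cited sources), the following Python function computes a Morton (Z-order) index by interleaving the bits of a 2D coordinate, so that spatially adjacent elements tend to map to nearby memory addresses:

```python
def morton_2d(x, y):
    """Interleave the bits of (x, y) into a Morton (Z-order) index.

    Bit i of x lands at even position 2*i, bit i of y at odd
    position 2*i + 1; nearby coordinates yield nearby indices.
    Supports coordinates up to 2**16 - 1.
    """
    index = 0
    for i in range(16):
        index |= ((x >> i) & 1) << (2 * i)
        index |= ((y >> i) & 1) << (2 * i + 1)
    return index
```

For example, the 2×2 block at the origin maps to the four consecutive indices 0–3 (`morton_2d(0, 0)` is 0 and `morton_2d(1, 1)` is 3), which is the property that lets a tile of texels share a cache line.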
A [[Scatter (vector addressing)|scatter]] memory access pattern combines sequential reads with indexed/random addressing for writes.
<ref name="gpu gems2">{{cite web|title=gpgpu scatter vs gather|url=http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter31.html}}</ref>
Compared to [[#GATHER|gather]], it may place less load on a cache hierarchy, since a [[processing element]] may dispatch writes in a 'fire and forget' manner (bypassing a cache altogether), whilst using predictable prefetching (or even DMA) for its source data.
 
However, it may be harder to parallelise, since there is no guarantee that the writes do not interact,<ref name="gpu gems"/> and many systems are still designed assuming that a hardware cache will coalesce many small writes into larger ones.
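A minimal Python sketch of the scatter pattern (the function name is illustrative, not from the source): the source array is read sequentially, while each write lands at a computed, possibly arbitrary, destination index.

```python
def scatter(src, index, out):
    """Scatter pattern: sequential reads from src, indexed writes to out.

    Element i of src is written to out[index[i]]; the writes need not
    be ordered or contiguous, which is why overlapping indices make
    parallelisation hard.
    """
    for i, value in enumerate(src):
        out[index[i]] = value
```

Note that if `index` contains duplicates, the result depends on write order, which is exactly the interaction between writes mentioned above.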
In a [[gather (vector addressing)|gather]] memory access pattern, reads are randomly addressed or indexed, whilst the writes are sequential (or [[#LINEAR|linear]]).
<ref name="gpu gems2"/>
An example is found in [[inverse texture mapping]], where data can be written out linearly across scanlines, whilst random access texture addresses are calculated per pixel.
 
Compared to [[#SCATTER|scatter]], the disadvantage is that caching (and bypassing latencies) is now essential for efficient reads; however, it is easier to parallelise, since the writes are guaranteed not to overlap. As such, the gather approach is more common in [[gpgpu]] programming,<ref name="gpu gems"/> where massive threading (enabled by parallelism) is used to hide read latencies.
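The mirror image of the scatter sketch above (again, an illustrative function name, not from the source): in a gather, the reads are indexed while the output positions are sequential, so each output element can be computed independently.

```python
def gather(src, index):
    """Gather pattern: indexed/random reads from src, sequential writes.

    Output position j reads src[index[j]]; because each output slot is
    written exactly once, the loop parallelises trivially across j.
    """
    return [src[j] for j in index]
```

This mirrors inverse texture mapping as described above: the output scanline is filled in order, while each pixel's source address is computed independently.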