<ref>{{cite web|title=Enhancing Cache Coherent Architectures with Memory Access Patterns for Embedded Many-core Systems|url=http://www.cc.gatech.edu/~bader/papers/EnhancingCache-SoC12.pdf}}</ref>
<ref>{{cite web|title=GPGPU Scatter vs. Gather|url=http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter31.html}}</ref>
<ref>{{cite web|title=Analysis of Energy and Performance of Code Transformations for PGAS-based Data Access Patterns|url=http://nic.uoregon.edu/pgas14/papers/pgas14_submission_17.pdf}}</ref>
Computer memory is usually described as 'random access', but traversals by software will still exhibit patterns that can be exploited for efficiency.
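A minimal C sketch of this point: the two functions below compute the same total over the same array, but the row-major sweep visits adjacent addresses (cache-line and prefetcher friendly) while the column-major sweep strides through memory. The matrix size and function names are illustrative assumptions, not from any cited source.

```c
#include <stddef.h>

enum { ROWS = 64, COLS = 64 };

/* Row-major sweep: consecutive iterations touch adjacent addresses,
   so each fetched cache line is fully used and hardware prefetchers
   can stay ahead of the loop. */
long sum_row_major(int m[ROWS][COLS]) {
    long s = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            s += m[r][c];
    return s;
}

/* Column-major sweep over the same row-major array: every access
   strides COLS * sizeof(int) bytes, touching a different cache line
   almost every time once the matrix outgrows the cache. */
long sum_col_major(int m[ROWS][COLS]) {
    long s = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            s += m[r][c];
    return s;
}

/* Self-check: the access pattern changes performance, not the result. */
int traversals_agree(void) {
    static int m[ROWS][COLS];
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            m[r][c] = (int)(r * COLS + c);
    return sum_row_major(m) == sum_col_major(m);
}
```

Both loops are "random access" as far as the memory model is concerned; only the order of addresses differs, which is exactly the pattern hardware can exploit.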
<ref>{{cite web|title=Cray and HPCC: Benchmark Developments and Results from the Past Year|url=https://cug.org/5-publications/proceedings_attendee_lists/2005CD/S05_Proceedings/pages/Authors/Wichmann/Wichmann_paper.pdf}} See the global random access results for the Cray X1; its vector architecture hides latency and is less sensitive to cache coherency.</ref>
The [[Partitioned global address space|PGAS]] approach may help by sorting operations by data on the fly, which is useful when the problem ''is'' figuring out the locality of unsorted data.<ref>{{cite web|title=Partitioned Global Address Space Programming|url=https://www.youtube.com/watch?v=NU4VfjISk2M}} Covers cases where PGAS is a win, e.g. when data is not already sorted, as when dealing with complex graphs.</ref>
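The "sort operations by data" idea can be sketched in plain C, independent of any PGAS runtime: buffer a batch of scattered updates, sort them by target address, then apply them as a mostly sequential sweep. The record type and function names are hypothetical, chosen for illustration only.

```c
#include <stdlib.h>

/* A pending update: which element to touch and the value to add.
   (Hypothetical record type for illustration, not from any PGAS API.) */
typedef struct { size_t index; int delta; } update_t;

static int by_index(const void *a, const void *b) {
    size_t ia = ((const update_t *)a)->index;
    size_t ib = ((const update_t *)b)->index;
    return (ia > ib) - (ia < ib);
}

/* Sort the batch by target index before applying it, turning a random
   scatter into an ascending, mostly sequential sweep through `data`. */
void apply_sorted(int *data, update_t *batch, size_t n) {
    qsort(batch, n, sizeof *batch, by_index);
    for (size_t i = 0; i < n; i++)
        data[batch[i].index] += batch[i].delta;
}

/* Self-check: out-of-order updates land correctly after sorting. */
int sorted_scatter_ok(void) {
    int data[4] = {0, 0, 0, 0};
    update_t batch[] = { {3, 1}, {0, 2}, {3, 1}, {1, 5} };
    apply_sorted(data, batch, sizeof batch / sizeof batch[0]);
    return data[0] == 2 && data[1] == 5 && data[2] == 0 && data[3] == 2;
}
```

In a distributed setting the same bucketing-by-destination step lets each partition receive its updates in a locality-friendly order.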
Data structures which rely heavily on [[pointer chasing]] can often produce poor locality of reference, although sorting can sometimes help.
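A small C contrast of the two layouts, under the assumption of a simple singly linked list: in the list traversal each node's address is only known after the previous `next` pointer has been loaded, so cache misses serialize when nodes are scattered; in the array the addresses are computable in advance.

```c
#include <stddef.h>

typedef struct node { int value; struct node *next; } node_t;

/* Pointer chasing: each iteration must finish loading `next` before
   the following node's address is even known, so misses serialize
   when nodes are scattered across the heap. */
long sum_list(const node_t *head) {
    long s = 0;
    for (const node_t *n = head; n != NULL; n = n->next)
        s += n->value;
    return s;
}

/* The same values compacted into an array: addresses are computable
   in advance, so the traversal is sequential and prefetchable. */
long sum_array(const int *v, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Self-check: both layouts yield the same total. */
int layouts_agree(void) {
    node_t c = {3, NULL}, b = {2, &c}, a = {1, &b};
    int v[] = {1, 2, 3};
    return sum_list(&a) == 6 && sum_array(v, 3) == 6;
}
```

Sorting or compacting the nodes into allocation order is one way the locality of such a structure can sometimes be recovered.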