Quuxplusone (talk | contribs) copyedit, clean up |
m Dated {{Who}}. (Build ) |
Line 10:
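The [[VMX]] (Vector Multimedia Extensions) technology is conceptually similar to the vector model provided by the SPU processors, but the two differ in several significant respects, including register count, supported integer formats, saturation support, and floating point modes.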
{| class="wikitable" style="margin: 1em auto 1em auto"
|+ '''VMX to SPU Comparison'''{{ref|vmxrefman}}<!-- 333 pages --><br>''unfinished''
! feature || VMX || SPU
|-
! [[word (computer science)|word]] size
| 32 bits || 32 bits
|-
! number of [[register (computer science)|registers]]
| 32 <!-- p.28/333 --> || 128
<!-- p.34/333 also shows 32 GP and 32 FP regs, are these part of VMX? -->
|-
! register width
| 128-bit quadword <!-- p.28/333 --> || 128-bit quadword
|-
! [[integer]] formats
| 8, 16, 32 <!-- p.26/333 --> || 8, 16, 32, 64 <!-- checked: there is no doubleword add or mul instr. -->
|-
! saturation support
| yes <!-- p.26/333 --> || no <!-- check this -->
|-
! byte ordering
| big (default), little <!--p.44/333 --> || big endian
|-
! [[floating point (computer science)|floating point]] modes
| Java, non-Java || single precision, IEEE double
|-
! [[memory (computer science)|memory]] alignment
| quadword only || quadword only
|}
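The saturation row of the comparison matters when porting integer code: if the SPU entry holds (the table marks it as still to be checked), VMX code that relies on saturating arithmetic needs an explicit clamp. The following is a minimal sketch in portable scalar C, not VMX or SPU intrinsics; the function names and the <code>LANES</code> constant are hypothetical and chosen only for illustration. It models one 128-bit quadword as sixteen 8-bit lanes and contrasts wrapping with saturating addition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16  /* sixteen 8-bit lanes fill one 128-bit quadword */

/* Modular add: each lane wraps around on overflow (200 + 100 -> 44). */
void add_modulo_u8(const uint8_t a[LANES], const uint8_t b[LANES],
                   uint8_t out[LANES])
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < LANES; i++)
        out[i] = (uint8_t)(a[i] + b[i]);
}

/* Saturating add: each lane clamps at the largest representable
   value instead of wrapping (200 + 100 -> 255). */
void add_saturate_u8(const uint8_t a[LANES], const uint8_t b[LANES],
                     uint8_t out[LANES])
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < LANES; i++) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        out[i] = (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
    }
}
```

A port that starts from saturating code would replace each saturating operation with the wrapping operation plus a clamp of this kind, at the cost of extra instructions per element.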
Line 45:
====Intrinsics====
Compilers for Cell{{
====Porting VMX code for SPU====
Line 59:
Transferring data between the local stores of different SPUs can have a large performance cost. The local stores of individual SPUs can be exploited using a variety of strategies.
Applications with high locality, such as dense matrix computations, represent an ideal workload class for the local stores in Cell BE.<ref>{{cite web|url=http://www.research.ibm.com/people/m/mikeg/papers/2006_ieeemicro.pdf|format=PDF|title=Synergistic Processing in Cell's Multicore Architecture|date=March 2006}}</ref>
Streaming computations can be accommodated efficiently by [[software pipelining]] the memory block transfers with a multi-buffering strategy, so that the transfer of the next block overlaps with computation on the current one.<ref name="research.ibm.com"/>
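The multi-buffering idea can be sketched with two buffers. This is an illustrative model in portable C under stated assumptions, not code for the Cell SDK: <code>fetch_chunk</code> is a hypothetical stand-in for an asynchronous DMA transfer into the local store and is implemented here with a blocking <code>memcpy</code>, and <code>CHUNK</code>, <code>process_chunk</code> and <code>stream_sum</code> are likewise invented names. On real hardware the fetch for the next block would be issued, computation would proceed on the current block, and the program would wait for the transfer only when it is about to use that buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* elements per block transfer; hypothetical size */

/* Stand-in for an asynchronous DMA "get" into the local store.
   Assumption: a real transfer would run in the background;
   this memcpy completes immediately. */
static void fetch_chunk(float *local, const float *main_mem, size_t index)
{
    memcpy(local, main_mem + index * CHUNK, CHUNK * sizeof(float));
}

/* Example computation on one block held in the local buffer. */
static float process_chunk(const float *local)
{
    float sum = 0.0f;
    for (size_t i = 0; i < CHUNK; i++)
        sum += local[i];
    return sum;
}

/* Sum n_chunks blocks of CHUNK floats using two alternating buffers. */
float stream_sum(const float *main_mem, size_t n_chunks)
{
    float buffers[2][CHUNK];
    float total = 0.0f;

    if (n_chunks == 0)
        return total;
    assert(main_mem != NULL);

    fetch_chunk(buffers[0], main_mem, 0);      /* prologue: prime buffer 0 */
    for (size_t i = 0; i < n_chunks; i++) {
        size_t cur = i % 2;
        size_t next = 1 - cur;
        if (i + 1 < n_chunks)                  /* start the next transfer */
            fetch_chunk(buffers[next], main_mem, i + 1);
        total += process_chunk(buffers[cur]);  /* compute on current block */
    }
    return total;
}
```

With more than two buffers the same loop structure applies, with the fetch issued further ahead of the block being processed.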