Cell software development

'''Software development''' for the [[Cell microprocessor]] involves a mixture of conventional development practices for the [[PowerPC]]-compatible PPU core, and novel software development challenges with regard to the functionally reduced SPU coprocessors.
 
==Linux on Cell==
An open source software-based strategy was adopted to accelerate the development of a Cell BE ecosystem and to provide an environment to develop Cell applications, including a GCC-based Cell compiler, binutils and a port of the Linux operating system.<ref name="research.ibm.com">{{cite web|url=http://www.research.ibm.com/people/m/mikeg/papers/2007_ieeecomputer.pdf|format=PDF|title=An Open Source Environment for Cell Broadband Engine System Software|date=June 2007}}</ref>
 
==Octopiler==
'''Octopiler''' is [[IBM]]'s prototype [[compiler]] to allow [[software developer]]s to write [[software code|code]] for [[Cell processor]]s.<ref>{{citation|title=Using advanced compiler technology to exploit the performance of the Cell Broadband Engine architecture|date=2017-10-23|url=http://www.research.ibm.com/journal/sj/451/eichenberger.html|archive-url=https://web.archive.org/web/20060411094457/http://www.research.ibm.com/journal/sj/451/eichenberger.html|archive-date=2006-04-11|url-status=dead|publisher=IBM Systems Journal}}</ref><ref>{{Cite web|date=2006-01-20|title=Compiler Technology for Scalable Architectures|url=http://domino.research.ibm.com/comm/research_projects.nsf/pages/cellcompiler.index.html|archive-url=https://web.archive.org/web/20080320071448/http://domino.research.ibm.com/comm/research_projects.nsf/pages/cellcompiler.index.html|archive-date=2008-03-20|access-date=2025-06-11|website=IBM Research|language=en-us}}</ref><ref>{{Cite web|last=Stokes|first=Jon|date=2006-02-26|title=IBM's Octopiler, or, why the PS3 is running late|url=https://arstechnica.com/uncategorized/2006/02/6265-2/|access-date=2025-06-11|website=Ars Technica}}</ref>
 
==Software portability==
| Java, non-Java || single precision, IEEE double
|-
! [[Data structure alignment|Memory alignment]]
| quadword only || quadword only
|}
 
====Porting VMX code for SPU====
A substantial amount of VMX ([[Altivec]]) code developed for IBM [[IBM Power microprocessors|Power processors]], particularly under the [[PowerPC]] version of [[macOS]], can potentially be adapted for use on the SPU. The feasibility of porting depends on the extent of VMX-specific features used, ranging from straightforward, to onerous, to completely impractical. However, key workloads typically map well to the SPU architecture.
 
In some cases, existing VMX code can be ported directly. If the VMX code is highly generic (makes few assumptions about the execution environment), the translation can be relatively straightforward. The two processors specify different [[binary format|binary code formats]], so recompilation is required at a minimum. Even where [[Instruction (computer science)|instructions]] exist with the same behaviors, they do not have the same instruction names, so these must be mapped as well. IBM's development toolkit includes compiler [[intrinsic function|intrinsics]] that automate much of this mapping.
 
In many cases, however, a directly equivalent instruction does not exist. The workaround might be obvious or it might not. For example, if saturation behavior is required on the SPU, it can be emulated with additional SPU instructions, at some cost in efficiency. At the other extreme, if Java floating-point semantics are required, this is almost impossible to achieve on the SPU processor. Achieving the same computation on the SPU might require an entirely different [[algorithm]] written from scratch.
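The saturation workaround can be illustrated with a minimal sketch. It is written in Python rather than with SPU intrinsics, and it assumes 16-bit signed lanes. The function names are hypothetical. It shows how a saturating add can be built from a wrapping add followed by extra compare-and-select steps, which is where the additional cost comes from.

```python
MASK16 = 0xFFFF
INT16_MAX = 0x7FFF
INT16_MIN = -0x8000


def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement signed value."""
    x &= MASK16
    return x - 0x10000 if x & 0x8000 else x


def wrapping_add16(a, b):
    """Modular add, as a non-saturating lane-wise add would produce."""
    return to_signed16(a + b)


def saturating_add16(a, b):
    """Saturating add built from a wrapping add plus compare/select steps."""
    s = wrapping_add16(a, b)
    # Overflow is only possible when both inputs share a sign
    # and the wrapped sum ends up with the opposite sign.
    overflow = (a >= 0) == (b >= 0) and (s >= 0) != (a >= 0)
    if overflow:
        return INT16_MAX if a >= 0 else INT16_MIN
    return s


def vector_saturating_add16(va, vb):
    """Apply the emulated saturating add to each pair of lanes."""
    return [saturating_add16(a, b) for a, b in zip(va, vb)]
```

For instance, `vector_saturating_add16([30000, -30000], [10000, -10000])` clamps both lanes to the 16-bit limits, whereas the wrapping add alone would return values of the wrong sign.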
Transferring data between the local stores of different SPUs can have a large performance cost. The local stores of individual SPUs can be exploited using a variety of strategies.
 
Applications with high locality, such as dense matrix computations, represent an ideal workload class for the local stores in Cell BE.<ref>{{cite web|url=http://www.research.ibm.com/people/m/mikeg/papers/2006_ieeemicro.pdf|format=PDF|title=Synergistic Processing in Cell's Multicore Architecture|date=March 2006}}</ref>
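Why dense matrix work suits a small local memory can be seen in a blocked multiplication, sketched below under simplifying assumptions. The function name and the tile copies are illustrative only. The slices stand in for transfers into a local store, and each copied tile is reused for many multiply-adds before it is replaced.

```python
def blocked_matmul(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) one tile at a time."""
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # Copy one tile of A and one tile of B into small local
                # buffers (a stand-in for a transfer into the local store).
                a_tile = [row[k0:k0 + tile] for row in a[i0:i0 + tile]]
                b_tile = [row[j0:j0 + tile] for row in b[k0:k0 + tile]]
                # Every element of each tile is reused across a full
                # row or column of the other tile.
                for i, a_row in enumerate(a_tile):
                    for k, a_ik in enumerate(a_row):
                        for j, b_kj in enumerate(b_tile[k]):
                            c[i0 + i][j0 + j] += a_ik * b_kj
    return c
```

With a tile edge of `tile`, each pair of copied tiles supports on the order of `tile**3` multiply-adds for `2 * tile**2` transferred elements, so the ratio of computation to data movement grows with the tile size that fits in local memory.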
 
Streaming computations can be efficiently accommodated using [[software pipelining]] of memory block transfers using a multi-buffering strategy.<ref name="research.ibm.com"/>
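The multi-buffering idea can be sketched with two buffers. The code below is a simplified model, not Cell code: `dma_get` is a hypothetical stand-in for an asynchronous block transfer, and `process_block` is a placeholder kernel. Python runs the steps sequentially, so the sketch shows only the ordering of fetches and computation, not real overlap.

```python
def dma_get(main_memory, start, size):
    """Hypothetical stand-in for an asynchronous transfer into the local store."""
    return main_memory[start:start + size]


def process_block(block):
    """Placeholder compute kernel: sum of squares of one block."""
    return sum(x * x for x in block)


def stream_double_buffered(main_memory, block_size):
    """Process main_memory block by block using two alternating buffers."""
    n_blocks = (len(main_memory) + block_size - 1) // block_size
    if n_blocks == 0:
        return 0
    buffers = [None, None]
    total = 0
    # Prologue: fetch the first block before the steady-state loop starts.
    buffers[0] = dma_get(main_memory, 0, block_size)
    for i in range(n_blocks):
        current = i % 2
        nxt = 1 - current
        if i + 1 < n_blocks:
            # On real hardware this fetch would proceed in the background
            # while the current block is being processed below.
            buffers[nxt] = dma_get(main_memory, (i + 1) * block_size, block_size)
        total += process_block(buffers[current])
    return total
```

The fetch for block `i + 1` is issued before block `i` is processed, which is the software-pipelined schedule. With genuinely asynchronous transfers, the transfer latency would be hidden behind the computation.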
 
The software cache offers a solution for random accesses.<ref>{{cite web|url=http://www.research.ibm.com/journal/sj/451/eichenberger.pdf|format=PDF|title=Using advanced compiler technology to exploit the performance of the Cell Broadband Engine architecture|date=January 2006}}</ref>
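As a rough illustration of the software cache idea, the sketch below models a direct-mapped cache managed entirely in software. The class name, line size, and line count are assumptions chosen for the example and do not describe IBM's implementation. A random access first checks a tag, and only a miss triggers a block fetch from main memory.

```python
class SoftwareCache:
    """Direct-mapped, read-only cache over a list that models main memory."""

    def __init__(self, main_memory, line_size=4, num_lines=8):
        self.main_memory = main_memory
        self.line_size = line_size
        self.num_lines = num_lines
        self.tags = [None] * num_lines   # which memory line each slot holds
        self.lines = [None] * num_lines  # cached copies of those lines
        self.hits = 0
        self.misses = 0

    def read(self, address):
        line_number = address // self.line_size
        slot = line_number % self.num_lines
        if self.tags[slot] != line_number:
            # Miss: fetch the whole line (a stand-in for a block transfer).
            self.misses += 1
            start = line_number * self.line_size
            self.lines[slot] = self.main_memory[start:start + self.line_size]
            self.tags[slot] = line_number
        else:
            self.hits += 1
        return self.lines[slot][address % self.line_size]
```

Repeated or nearby accesses are then served from the cached line, while every access pays for the tag check in software. That overhead is the trade-off against explicitly scheduled transfers.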
 
More sophisticated applications can use multiple strategies for different data types.<ref>{{cite web|url=http://www.research.ibm.com/cell/papers/2008_vee_cellgc.pdf|format=PDF|title=Cell GC: Using the Cell Synergistic Processor as a Garbage Collection Coprocessor |date=March 2008}}</ref>
 
===References===
* [http://www.research.ibm.com/cell/ The Cell Project at IBM Research]
* [http://cag.csail.mit.edu/crg/papers/eichenberger05cell.pdf Optimizing Compiler for a CELL Processor]