Heterogeneous System Architecture

Content deleted Content added
m {{Anchor|AMDKFD|HQ|HMM}}Software support: clean up; HTTP→HTTPS using AWB
Overview: structure better which also allows redirect to section
Line 33:
Sharing system memory directly between multiple system actors, an approach originally introduced by [[embedded system]]s such as the [[Cell Broadband Engine]], makes heterogeneous computing more mainstream. Heterogeneous computing itself refers to systems that contain multiple processing units{{snd}} [[central processing unit]]s (CPUs), [[graphics processing unit]]s (GPUs), [[digital signal processor]]s (DSPs), or any type of [[application-specific integrated circuit]]s (ASICs). The system architecture allows any accelerator, for instance a [[GPU|graphics processor]], to operate at the same processing level as the system's CPU.
 
Among its main features, HSA defines a unified [[virtual address space]] for compute devices: where GPUs traditionally have their own memory, separate from the main (CPU) memory, HSA requires these devices to share [[page (computer memory)|page tables]] so that devices can exchange data by sharing [[pointer (computer programming)|pointers]]. This is to be supported by custom [[memory management unit]]s.<ref name="whitepaper"/>{{rp|6–7}} To render interoperability possible and also to ease various aspects of programming, HSA is intended to be [[instruction set|ISA]]-agnostic for both CPUs and accelerators, and to support high-level programming languages.
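
The practical effect is sketched below in C++; the <code>dispatch_to_accelerator</code> helper is hypothetical and simply runs the kernel on the CPU, but on an HSA system the runtime could hand the very same pointer to an accelerator, which would resolve it through the shared page tables rather than through explicit host-to-device copies:
<syntaxhighlight lang="cpp">
// Minimal sketch of pointer sharing under a unified virtual address space.
// "dispatch_to_accelerator" is a hypothetical stand-in, not an HSA API call.
#include <cstddef>
#include <vector>

using Kernel = void (*)(float*, std::size_t);

void dispatch_to_accelerator(Kernel k, float* data, std::size_t n) {
    k(data, n);  // placeholder: an HSA runtime would enqueue this on a GPU agent
}

void scale_kernel(float* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        data[i] *= 2.0f;
}

int main() {
    std::vector<float> host(1 << 20, 1.0f);  // ordinary CPU-side allocation
    // The host pointer itself is passed; with shared page tables no separate
    // device allocation or copy-in/copy-out step is required.
    dispatch_to_accelerator(scale_kernel, host.data(), host.size());
}
</syntaxhighlight>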
 
So far, the HSA specifications cover:
 
===HSA Intermediate Layer===<!--incoming redirect-->
The HSA Intermediate Layer (HSAIL) is a [[p-code machine|virtual instruction set]] for parallel programs:
* similar{{according to whom|date=May 2015}} to [[LLVM Intermediate Representation]] and [[Standard Portable Intermediate Representation|SPIR]] (used by [[OpenCL]] and [[Vulkan (API)|Vulkan]])
* finalized to a specific instruction set by a [[Just-in-time compilation|JIT compiler]]
* allows late decisions on which core(s) should run a task
* explicitly parallel
* supports exceptions, virtual functions and other high-level features
* provides syscall methods (I/O, [[printf]],{{clarify|reason=printf is not a syscall on any operating system that I know|date=May 2015}} etc.)
* debugging support
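
HSAIL is a virtual instruction set rather than a C++ API, but the sketch below models two of the properties listed above: the kernel is written once as an explicitly parallel function of a work-item index, and the choice of which agent executes it is deferred to run time, where a JIT finalizer would lower the portable code to that agent's native instruction set. The <code>parallel_for_each</code> dispatcher here is hypothetical and emulates any agent with CPU threads:
<syntaxhighlight lang="cpp">
// Conceptual model only: not HSAIL syntax and not the HSA runtime API.
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

enum class Agent { CPU, GPU };  // hypothetical agent handle

// Hypothetical dispatcher: every "agent" is emulated with CPU threads here.
void parallel_for_each(Agent /*where*/, std::size_t work_items,
                       const std::function<void(std::size_t)>& kernel) {
    unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w)
        pool.emplace_back([&, w] {
            for (std::size_t i = w; i < work_items; i += workers)
                kernel(i);  // one invocation per work-item
        });
    for (auto& t : pool) t.join();
}

int main() {
    std::vector<float> a(1024, 1.0f), b(1024, 2.0f), c(1024);
    Agent where = Agent::GPU;  // decided late, e.g. from load or power state
    parallel_for_each(where, c.size(),
                      [&](std::size_t i) { c[i] = a[i] + b[i]; });
}
</syntaxhighlight>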

===HSA memory model===
* compatible with [[C++11]], OpenCL, [[Java (programming language)|Java]] and [[.NET Framework|.NET]] memory models
* relaxed consistency
* designed to support both managed languages (e.g. Java) and unmanaged languages (e.g. [[C (programming language)|C]])
* intended to make it much easier to develop third-party compilers for a wide range of heterogeneous products programmed in [[Fortran]], C++, [[C++ AMP]], Java, et al.
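
The kind of code this model is meant to support can be expressed with plain [[C++11]] atomics: ordinary data writes are unordered by default (relaxed consistency), and publication is done with a release store that the consumer pairs with an acquire load. The sketch below uses two CPU threads; under HSA the consumer could, in principle, be an accelerator reading the same shared memory:
<syntaxhighlight lang="cpp">
// Release/acquire publication in plain C++11; a sketch of the idiom only.
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain data, written without ordering
std::atomic<bool> ready{false};  // publication flag

void producer() {
    payload = 42;                                  // relaxed (ordinary) write
    ready.store(true, std::memory_order_release);  // publish: orders the write above
}

void consumer() {
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    assert(payload == 42);  // visible because release pairs with acquire
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
}
</syntaxhighlight>
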
===HSA dispatcher and run-time===
* designed to enable heterogeneous task queueing: a work queue per core, distribution of work into queues, load balancing by work stealing
* any core can schedule work for any other, including itself
* significant reduction of the overhead of scheduling work for a core
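
The sketch below is a generic illustration of per-core work queues with [[work stealing]], not the HSA runtime API: each worker owns a queue, pops its own newest task, and steals the oldest task from a peer when idle; HSA's user-level queues apply the same idea across heterogeneous agents, so that any core can queue work for any other:
<syntaxhighlight lang="cpp">
// Generic work-stealing illustration (hypothetical, CPU threads only).
#include <atomic>
#include <cstdio>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

struct WorkQueue {
    std::mutex m;
    std::deque<std::function<void()>> tasks;

    void push(std::function<void()> t) {
        std::lock_guard<std::mutex> lock(m);
        tasks.push_back(std::move(t));
    }
    bool pop_local(std::function<void()>& t) {  // owner takes its newest task
        std::lock_guard<std::mutex> lock(m);
        if (tasks.empty()) return false;
        t = std::move(tasks.back());
        tasks.pop_back();
        return true;
    }
    bool steal(std::function<void()>& t) {      // a thief takes the oldest task
        std::lock_guard<std::mutex> lock(m);
        if (tasks.empty()) return false;
        t = std::move(tasks.front());
        tasks.pop_front();
        return true;
    }
};

int main() {
    const unsigned n_workers = 4;
    std::vector<WorkQueue> queues(n_workers);
    std::atomic<int> remaining{64};

    for (int i = 0; i < 64; ++i)            // deliberately unbalanced:
        queues[0].push([i, &remaining] {    // all work starts on queue 0
            std::printf("task %d\n", i);
            remaining.fetch_sub(1);
        });

    std::vector<std::thread> workers;
    for (unsigned id = 0; id < n_workers; ++id)
        workers.emplace_back([id, &queues, &remaining] {
            std::function<void()> task;
            while (remaining.load() > 0) {  // busy-waits for brevity
                if (queues[id].pop_local(task)) { task(); continue; }
                for (unsigned v = 0; v < queues.size(); ++v)  // try to steal
                    if (v != id && queues[v].steal(task)) { task(); break; }
            }
        });
    for (auto& t : workers) t.join();
}
</syntaxhighlight>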
 
Mobile devices are one of HSA's application areas, in which it yields improved power efficiency.<ref name="gpuscience" />