{{short description|Open standard for parallel computing}}
{{lowercase title}}
{{otheruses|OneAPI (disambiguation)}}
{{Infobox software
| name = oneAPI
| logo = OneAPI-rgb-3000.png
| logo_caption =
| latest_release_date = {{start date and age|2020|11|13}}
| operating_system = [[Cross-platform]]
| platform = Cross-platform
| genre = [[Open-source software|Open-source]] [[Formal specification|software specification]] for parallel programming
| repo = {{URL|https://github.com/oneapi-src}}
| website = {{official URL}}
}}
 
'''oneAPI''' is an [[open standard]], adopted by Intel,{{sfn|Fortenberry|Tomov|2022|p=22}} for a unified [[application programming interface]] (API) intended to be used across different computing [[Hardware acceleration|accelerator]] ([[coprocessor]]) architectures, including [[GPU]]s, [[AI accelerator]]s and [[field-programmable gate array]]s. It is intended to eliminate the need for developers to maintain separate code bases, multiple programming languages, tools, and workflows for each architecture.<ref>{{Cite web|url=https://www.hpcwire.com/2019/12/09/intel-expands-its-silicon-portfolio-and-oneapi-software-initiative-for-next-generation-hpc/|title=Intel Expands its Silicon Portfolio, and oneAPI Software Initiative for Next-Generation HPC|date=2019-12-09|website=HPCwire|language=en-US|access-date=2020-02-11}}</ref><ref>{{Cite web|url=https://www.hpcwire.com/2019/11/17/intel-debuts-new-gpu-ponte-vecchio-and-outlines-aspirations-for-oneapi/|title=Intel Debuts New GPU – Ponte Vecchio – and Outlines Aspirations for oneAPI|date=2019-11-18|website=HPCwire|language=en-US|access-date=2020-02-11}}</ref><ref>{{Cite web|url=https://www.extremetech.com/computing/302284-sc19-intel-unveils-new-gpu-stack-oneapi-development-effort|title=SC19: Intel Unveils New GPU Stack, oneAPI Development Effort - ExtremeTech|website=www.extremetech.com|access-date=2020-02-11}}</ref><ref>{{Cite web|url=https://www.servethehome.com/intel-one-api-to-rule-them-all-is-much-needed/|title=Intel One API to Rule Them All Is Much Needed to Expand TAM|last=Kennedy|first=Patrick|date=2018-12-24|website=ServeTheHome|language=en-US|access-date=2020-02-11}}</ref>
 
oneAPI competes with other GPU computing stacks: [[CUDA]] by [[Nvidia]] and [[ROCm]] by [[AMD]].
 
== The oneAPI Specification ==
The oneAPI specification extends existing developer programming models to enable multiple hardware architectures through a data-parallel language, a set of library APIs, and a low-level hardware interface to support cross-architecture programming. It builds upon industry standards and provides an open, cross-platform developer stack.<ref name="spec">{{cite web |url=https://www.oneapi.io/spec/ |title=oneAPI Specification |last= |first= |date= |website=oneAPI |archive-url= |archive-date= |access-date=}}</ref><ref>{{Cite web|date=2021-03-23|title=Preparing for the Arrival of Intel's Discrete High-Performance GPUs|url=https://www.hpcwire.com/2021/03/23/preparing-for-the-arrival-of-intels-discrete-high-performance-gpus/|access-date=2021-03-29|website=HPCwire|language=en-US}}</ref>
 
== The Language – Data Parallel C++ ==
[[Intel C++ Compiler|DPC++]]<ref>{{Cite web|url=https://www.apress.com/gp/data-parallel-c-advanced-chapters-just-released/17382670|title=Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems Using C++ and SYCL|last=|first=|date=|website=Apress|url-status=live|archive-url=|archive-date=|access-date=}}</ref><ref>{{Cite web|url=https://insidebigdata.com/2019/12/16/heterogeneous-computing-programming-oneapi-and-data-parallel-c/|title=Heterogeneous Computing Programming: oneAPI and Data Parallel C++|last=Team|first=Editorial|date=2019-12-16|website=insideBIGDATA|language=en-US|access-date=2020-02-11}}</ref> is a programming language implementation of oneAPI, built upon the [[ISO C++]] and [[Khronos Group]] [[SYCL]] standards.<ref>{{Cite web|url=https://www.khronos.org/news/permalink/intels-one-api-project-incorporates-sycl|title=The Khronos Group|date=2020-02-11|website=The Khronos Group|language=en|access-date=2020-02-11}}</ref> DPC++ is an implementation of SYCL with extensions that are proposed for inclusion in future revisions of the SYCL standard, including unified shared memory, group algorithms, and sub-groups.<ref>{{Cite web|date=2020-06-30|title=Khronos Steps Towards Widespread Deployment of SYCL with Release of SYCL 2020 Provisional Specification|url=https://www.khronos.org/news/press/khronos-releases-sycl-2020-provisional-specification|access-date=2020-07-06|website=The Khronos Group|language=en}}</ref><ref>{{Cite web|last=staff|date=2020-06-30|title=New, Open DPC++ Extensions Complement SYCL and C++|url=https://insidehpc.com/2020/06/new-open-dpc-extensions-complement-sycl-and-c/|access-date=2020-07-06|website=insideHPC|language=en-US}}</ref><ref>{{Cite web|date=2021-02-09|title=SYCL 2020 Launches with New Name, New Features, and High Ambition|url=https://www.hpcwire.com/2021/02/09/sycl-2020-launches-new-name-new-features/|access-date=2021-02-16|website=HPCwire|language=en-US}}</ref>
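
As a rough illustration of the data-parallel style that SYCL and DPC++ build on, the following sketch uses only standard C++. It shows a kernel written for a single index and applied across a range, operating on one allocation that both the launching code and the kernel can see, which is the idea behind unified shared memory. The names <code>HostQueue</code> and <code>double_all</code> are hypothetical and introduced only for this example; the loop runs sequentially on the host and is not the SYCL or DPC++ API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device queue (an assumption for illustration).
// It invokes the kernel once per index, sequentially on the host. A real SYCL
// runtime would dispatch these invocations to a CPU, GPU, or other accelerator.
struct HostQueue {
    template <typename Kernel>
    void parallel_for(std::size_t count, Kernel kernel) const {
        for (std::size_t i = 0; i < count; ++i) {
            kernel(i);  // each index is processed independently of the others
        }
    }
};

// Doubles every element, written in the data-parallel style: the kernel body
// describes the work for one index only.
inline std::vector<int> double_all(std::vector<int> values) {
    HostQueue queue;
    int* data = values.data();  // one pointer used by both launcher and kernel
    queue.parallel_for(values.size(), [=](std::size_t i) { data[i] *= 2; });
    return values;
}
```

Calling <code>double_all({1, 2, 3})</code> returns a vector holding 2, 4 and 6. In SYCL 2020 the corresponding pieces would be a <code>sycl::queue</code>, an allocation obtained from <code>sycl::malloc_shared</code>, and <code>parallel_for</code> over a <code>sycl::range</code>; building that form requires a SYCL toolchain such as DPC++.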
 
== Libraries ==
The set of APIs<ref name="spec" /> spans several domains, including libraries for linear algebra, deep learning, machine learning, video processing, and others.
{| class="wikitable"
!'''Library Name'''
!'''Short Name'''
!'''Description'''
|-
|oneAPI DPC++ Library
|oneDPL
|Algorithms and functions to speed DPC++ kernel programming
|-
|[[Math Kernel Library|oneAPI Math Kernel Library]]
|oneMKL
|Math routines including matrix algebra, FFT, and vector math
|-
|[[Data Analytics Library|oneAPI Data Analytics Library]]
|oneDAL
|Machine learning and data analytics functions
|-
|oneAPI Collective Communications Library
|oneCCL
|Communication patterns for distributed deep learning
|-
|[[Threading Building Blocks|oneAPI Threading Building Blocks]]
|oneTBB
|Threading and memory management template library
|}
 
The [[source code]] of parts of the above libraries is available on GitHub.<ref>{{cite web |title=oneAPI-SRC |url=https://github.com/oneapi-src |website=GitHub |language=en}}</ref>
The oneAPI documentation also lists the "Level Zero" API defining the low-level direct-to-metal interfaces and a set of [[Ray tracing (graphics)|ray tracing]] components with its own APIs.<ref name="spec" />
 
== The hardware abstraction layer ==
oneAPI Level Zero,<ref>{{Cite web|url=https://www.tomshardware.com/news/intel-releases-bare-metal-oneapi-level-zero-specification|title=Intel Releases Bare-Metal oneAPI Level Zero Specification|last=Verheyde|first=Arne|website=Tom's Hardware|date=8 December 2019 |language=en|access-date=2020-02-11}}</ref><ref>{{Cite web|url=https://www.phoronix.com/scan.php?page=news_item&px=Intel-oneAPI-Level-Zero|title=Intel's Compute Runtime Adds oneAPI Level Zero Support - Phoronix|website=www.phoronix.com|access-date=2020-03-10}}</ref><ref>{{Cite web|url=https://www.phoronix.com/scan.php?page=article&item=intel-level-zero&num=1|title=Initial Benchmarks With Intel oneAPI Level Zero Performance - Phoronix|website=www.phoronix.com|access-date=2020-04-13}}</ref> the low-level hardware interface, defines a set of capabilities and services that a hardware accelerator needs to interface with compiler runtimes and other developer tools.
 
== Implementations ==
[[Intel]] has released oneAPI production toolkits<ref>{{Cite web|url=https://software.intel.com/en-us/oneapi|title=Intel oneAPI Product|last=|first=|date=|website=Intel oneAPI Toolkits|url-status=live|archive-url=|archive-date=|access-date=}}</ref><ref>{{Cite web|url=https://www.colfax-intl.com/training/intel-oneapi-training|title=oneAPI Training|website=Colfax International|access-date=2020-02-11}}</ref> that implement the specification and add CUDA code migration, analysis, and debug tools.<ref>{{Cite news|date=2020-11-11|title=Intel Champions XPU Vision With oneAPI, Data Center GPUs - SDxCentral|language=en-US|work=SDxCentral|url=https://www.sdxcentral.com/articles/news/intel-champions-xpu-vision-with-oneapi-data-center-gpus/2020/11/|access-date=2020-11-11}}</ref><ref>{{Cite web|date=2020-11-11|title=Intel Debuts oneAPI Gold and Provides More Details on GPU Roadmap|url=https://www.hpcwire.com/2020/11/11/intel-debuts-oneapi-gold-and-provides-more-details-on-gpu-roadmap/|access-date=2020-11-11|website=HPCwire|language=en-US}}</ref><ref>{{Cite web|last=Moorhead|first=Patrick|title=Intel Announces Gold Release Of OneAPI Toolkits And New Intel Server GPU|url=https://www.forbes.com/sites/patrickmoorhead/2020/12/02/intel-announces-gold-release-of-oneapi-toolkits-and-new-intel-server-gpu/|access-date=2020-12-08|website=Forbes|language=en}}</ref> These include the [[Intel C++ compiler|Intel oneAPI DPC++/C++ Compiler]],<ref>{{Cite web|title=Data Parallel C++ for Cross-Architecture Applications|url=https://www.intel.com/content/www/us/en/develop/tools/oneapi/components/dpc-compiler.html|access-date=2021-10-07|website=Intel|language=en}}</ref> [[Intel Fortran Compiler]], [[VTune|Intel VTune]] Profiler<ref>{{Cite web|title=Fix Performance Bottlenecks with Intel® VTune™ Profiler|url=https://www.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html|access-date=2021-10-07|website=Intel|language=en}}</ref> and multiple performance libraries.
 
[[Codeplay]] has released an open-source layer<ref>{{Cite web|url=https://www.hpcwire.com/2020/02/04/codeplay-open-sources-a-version-of-computecpp-for-nvidia-gpus/|title=Codeplay Open Sources a Version of DPC++ for Nvidia GPUs|date=2020-02-05|website=HPCwire|language=en-US|access-date=2020-02-12}}</ref><ref>{{Cite web|url=https://www.phoronix.com/scan.php?page=news_item&px=Intel-oneAPI-DPC-SYCL-NVIDIA-CU|title=Intel's oneAPI / DPC++ / SYCL Will Run Atop NVIDIA GPUs With Open-Source Layer - Phoronix|website=www.phoronix.com|access-date=2019-12-06}}</ref><ref>{{Cite web|url=https://www.codeplay.com/portal/02-03-20-codeplay-contribution-to-dpcpp-brings-sycl-support-for-nvidia-gpus|title=Codeplay - Codeplay contribution to DPC++ brings SYCL support for NVIDIA GPUs|website=www.codeplay.com|access-date=2020-02-11}}</ref> to allow oneAPI and [[SYCL]]/DPC++ to run atop [[Nvidia]] [[GPU]]s via [[CUDA]].
 
[[Heidelberg University|University of Heidelberg]] has developed a SYCL/DPC++ implementation for both AMD and Nvidia GPUs.<ref>{{Cite web|last=Salter|first=Jim|date=2020-09-30|title=Intel, Heidelberg University team up to bring Radeon GPU support to AI|url=https://arstechnica.com/gadgets/2020/09/intel-heidelberg-university-team-up-to-bring-radeon-gpu-support-to-ai/|access-date=2021-10-07|website=Ars Technica|language=en-us}}</ref>
 
[[Huawei]] released a DPC++ compiler for its Ascend AI chipset.<ref>{{Citation|title=Extending DPC++ with Support for Huawei Ascend AI Chipset| date=27 April 2021 |url=https://www.youtube.com/watch?v=7foee4_QkbU|language=en|access-date=2021-10-07}}</ref>
 
[[Fujitsu]] has created an open-source [[ARM architecture|ARM]] version of the oneAPI Deep Neural Network Library (oneDNN)<ref>{{Cite web|last=fltech|date= 19 November 2020|title=A Deep Dive into a Deep Learning Library for the A64FX Fugaku CPU - The Development Story in the Developer's Own Words|url=https://blog.fltech.dev/entry/2020/11/19/fugaku-onednn-deep-dive-en|access-date=2021-02-10|website=fltech - 富士通研究所の技術ブログ|language=ja}}</ref> for their [[Fugaku (supercomputer)|Fugaku CPU]].
 
== Unified Acceleration Foundation (UXL) and the future for oneAPI{{anchor|UXL}} ==
 
The Unified Acceleration Foundation (UXL) is a technology consortium working to continue the oneAPI initiative. Its goal is to create an open-standard accelerator software ecosystem, along with related open standards and specification projects, through working groups and special interest groups (SIGs). The resulting ecosystem is intended to compete with Nvidia's CUDA. The main companies behind it are Intel, Google, ARM, Qualcomm, Samsung, Imagination, and VMware.<ref>{{Cite web |title=Exclusive: Behind the plot to break Nvidia's grip on AI by targeting software |website=[[Reuters]] |url=https://www.reuters.com/technology/behind-plot-break-nvidias-grip-ai-by-targeting-software-2024-03-25/ |access-date=2024-04-05}}</ref>
 
==References==
{{reflist}}
 
== Sources ==
* {{cite conference |url= https://icl.utk.edu/files/publications/2022/icl-utk-1616-2022.pdf |title=Extending MAGMA Portability with OneAPI |last1=Fortenberry |first1=Anna |last2=Tomov |first2=Stanimire |date=2022 |publisher=[[IEEE]] |book-title= |pages=22–31 |location= |conference=2022 Workshop on Accelerator Programming Using Directives (WACCPD) |id=}}
 
== External links ==
* [https://www.oneapi.com/ oneAPI Industry Specification]
* {{official website}}
* [https://software.intel.com/en-us/oneapi Intel oneAPI Product]
* {{GitHub|oneapi-src|oneAPI}}
* [https://www.codeplay.com/portal/12-16-19-bringing-nvidia-gpu-support-to-sycl-developers Bringing Nvidia GPU support to SYCL developers]
* {{cite book |display-authors= 1 |first1= James |last1= Reinders |first2= Ben |last2= Ashbaugh |first3= James |last3= Brodman |first4= Michael |last4= Kinsner |first5= John |last5= Pennycook |first6= Xinmin |last6= Tian |url= https://link.springer.com/book/10.1007/978-1-4842-5574-2 |title= Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL |publisher= Springer |isbn= 978-1-4842-5574-2 |doi= 10.1007/978-1-4842-5574-2 |series= Open Access Book |year= 2021 |s2cid= 226231933 }}
* [https://developer.codeplay.com/products/oneapi/nvidia/2025.1.0/guides/index oneAPI for NVIDIA GPUs 2025.1.0]
* [https://developer.codeplay.com/products/oneapi/amd/2025.1.0/guides/index oneAPI for AMD GPUs 2025.1.0]
 
[[Category:Application programming interfaces]]
[[Category:Cross-platform software]]
[[Category:Intel software]]