{{Use American English|date=January 2019}}
{{Use mdy dates|date=October 2021}}
A '''neural processing unit''' ('''NPU'''), also known as '''AI accelerator''' or '''deep learning processor''', is a specialized [[hardware accelerator]] designed to speed up [[artificial intelligence]] (AI) and [[machine learning]] applications, such as [[artificial neural network]]s.
==Use==
=== Consumer devices ===
AI accelerators are used in mobile devices such as Apple [[iPhone]]s, AMD [[AI engine|AI engines]]<ref>{{Cite journal |last=Brown |first=Nick |date=2023-02-12 |title=Exploring the Versal AI Engines for Accelerating Stencil-based Atmospheric Advection Simulation |url=https://dl.acm.org/doi/10.1145/3543622.3573047 |journal=Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays |series=FPGA '23 |___location=New York, NY, USA |publisher=Association for Computing Machinery |pages=91–97 |doi=10.1145/3543622.3573047 |isbn=978-1-4503-9417-8 |arxiv=2301.13016}}</ref> in Versal and NPUs, [[Huawei]], and [[Google Pixel]] smartphones,<ref>{{Cite web |url=https://consumer.huawei.com/en/press/news/2017/ifa2017-kirin970 |title=HUAWEI Reveals the Future of Mobile AI at IFA}}</ref> and seen in many [[Apple silicon]] chips.
It has more recently (circa 2022) been added to computer processors from [[Intel]],<ref>{{Cite web |url=https://www.intel.com/content/www/us/en/newsroom/news/intels-lunar-lake-processors-arriving-q3-2024.html |title=Intel's Lunar Lake Processors Arriving Q3 2024 |website=Intel |date=May 20, 2024}}</ref> [[AMD]],<ref>{{cite web |title=AMD XDNA Architecture |url=https://www.amd.com/en/technologies/xdna.html}}</ref> and [[Apple silicon]].
On consumer devices, the NPU is intended to be small and power-efficient, yet reasonably fast when running small models. To this end, NPUs are designed to support low-bitwidth operations on data types such as INT4, INT8, FP8, and FP16. A common performance metric is trillions of operations per second (TOPS), though this metric alone does not indicate what kinds of operations are being performed.<ref>{{cite web |title=A guide to AI TOPS and NPU performance metrics |url=https://www.qualcomm.com/news/onq/2024/04/a-guide-to-ai-tops-and-npu-performance-metrics}}</ref>
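To illustrate how a TOPS figure relates to hardware parameters, the following sketch computes it from a multiply–accumulate (MAC) unit count and clock frequency; the numbers are hypothetical and do not describe any specific product:

```python
# Hypothetical NPU: 4096 MAC units clocked at 1.5 GHz.
# One multiply-accumulate is conventionally counted as two operations
# (one multiply plus one add), so peak throughput is 2 * MACs * clock.
mac_units = 4096
clock_hz = 1.5e9

ops_per_second = 2 * mac_units * clock_hz
tops = ops_per_second / 1e12  # 12.288 TOPS
```

Note that such a peak figure is usually quoted at the narrowest supported precision (e.g. INT8); the same hardware typically delivers fewer TOPS at FP16, which is why the metric alone does not characterize performance.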
Mobile NPU vendors typically provide their own [[application programming interface]] such as the Snapdragon Neural Processing Engine. An operating system or a higher-level library may provide a more generic interface such as TensorFlow Lite with LiteRT Next (Android) or CoreML (iOS, macOS).
Consumer CPU-integrated NPUs are accessible through vendor-specific APIs: AMD (Ryzen AI), Intel (OpenVINO), and Apple (CoreML) each provide their own, which higher-level libraries can build upon.
GPUs generally use existing [[GPGPU]] pipelines such as CUDA and OpenCL adapted for lower precisions. Custom-built systems such as the Google TPU use private interfaces.
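The low-bitwidth operations these pipelines are adapted for rest on quantization: mapping floating-point values to narrow integers via a scale and zero-point. A minimal sketch of affine INT8 quantization (the scale and zero-point here are chosen purely for illustration):

```python
def quantize(x, scale, zero_point):
    """Map a float to the INT8 range [-128, 127] via affine quantization."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the representable range

def dequantize(q, scale, zero_point):
    """Recover an approximate float from the quantized integer."""
    return (q - zero_point) * scale

scale, zero_point = 0.05, 0
q = quantize(1.234, scale, zero_point)    # 25
x = dequantize(q, scale, zero_point)      # 1.25, within one quantization step
```

The accelerator performs its matrix arithmetic on the narrow integers, which is what makes low-bitwidth MAC units so much cheaper in silicon area and energy than full-precision floating-point ones.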
== Notes ==
{{notelist}}

== References ==
{{Reflist|32em}}

== External links ==