Multidimensional DSP with GPU acceleration: Difference between revisions

Revision as of 17:00, 13 May 2024 by WOSlinker (m, remove href oddities)
Latest revision as of 23:40, 26 August 2025 by Citation bot (Added bibcode. Removed URL that duplicated identifier. Removed parameters. Suggested by Headbomb.)
(One intermediate revision by one other user not shown)
Line 2:
Multidimensional digital signal processing (MDSP) refers to the extension of [[Digital signal processing]] (DSP) techniques to signals that vary in more than one dimension. While conventional DSP typically deals with one-dimensional data, such as time-varying [[Audio signal|audio signals]], MDSP involves processing signals in two or more dimensions. Many of the principles from one-dimensional DSP, such as [[Fourier transform|Fourier transforms]] and [[filter design]], have analogous counterparts in multidimensional signal processing. Modern [[general-purpose computing on graphics processing units|general-purpose graphics processing units]] (GPGPUs) offer high throughput on vector operations and numeric manipulations through a high degree of parallelism. Because processing digital signals, particularly multidimensional signals, often involves a series of vector operations on massive numbers of independent data samples, GPGPUs are now widely employed to accelerate multidimensional DSP tasks such as [[image processing]], [[Video processing|video codecs]], [[Radar signal characteristics|radar signal analysis]], [[sonar signal processing]], and [[ultrasound scan]]ning. For such workloads, GPGPUs can substantially reduce computation time compared with [[central processing unit]]s (CPUs), [[digital signal processor]]s (DSPs), or [[Field-programmable gate array|FPGA]]-based accelerators.
==Motivation==
Line 25:
[[File:SIMD GPGPU.jpg|alt=Figure illustrating a SIMD/vector computation unit in GPGPUs.|thumb|GPGPU/SIMD computation model]]
Modern GPU designs are mainly based on the [[Single instruction, multiple data|SIMD]] (Single Instruction Multiple Data) computation paradigm.<ref>{{cite journal|title=NVIDIA Tesla: A Unified Graphics and Computing Architecture|journal=IEEE Micro|date=2008-03-01|issn=0272-1732|pages=39–55|volume=28|issue=2|doi=10.1109/MM.2008.31|first1=E.|last1=Lindholm|first2=J.|last2=Nickolls|first3=S.|last3=Oberman|first4=J.|last4=Montrym|bibcode=2008IMicr..28b..39L |s2cid=2793450|language=en}}</ref><ref>{{cite book|title=Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)|last1=Kim|first1=Hyesoon|author1-link=Hyesoon Kim|publisher=Morgan & Claypool Publishers|year=2012|isbn=978-1-60845-954-4|last2=Vuduc|first2=Richard|last3=Baghsorkhi|first3=Sara|last4=Choi|first4=Jee|last5=Hwu|first5=Wen-Mei W.|editor-last=Hill|editor-first=Mark D.|doi=10.2200/S00451ED1V01Y201209CAC020|language=en}}</ref> GPU devices of this type are called [[General-purpose computing on graphics processing units|general-purpose GPUs (GPGPUs)]]. GPGPUs can perform one operation on multiple independent data elements concurrently with their vector or SIMD functional units. A modern GPGPU can spawn thousands of concurrent threads and process them in batches. Because many DSP problems can be solved by [[Divide and conquer algorithms|divide-and-conquer]] algorithms, GPGPUs are readily employed as DSP accelerators. A large and complex DSP problem can be divided into many small, independent numeric problems that are processed concurrently, so that the overall execution time is reduced significantly.
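The decomposition described above can be illustrated with a minimal CPU-side sketch in Python. It is not GPU code and is not taken from the cited sources: the names `scale_sample` and `launch_grid`, the gain operation, and the worker count are all hypothetical choices made for illustration. The sketch only shows the idea of running the same small operation once per sample of a 2-D signal, with no dependency between samples.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_sample(image, row, col, gain):
    """Per-sample "kernel" (hypothetical): the same operation for every index."""
    return gain * image[row][col]


def launch_grid(image, gain, workers=4):
    """Run scale_sample once per (row, col) index of a 2-D signal.

    Each task reads one input sample and produces one output sample, so the
    tasks are independent and may run in any order or concurrently.
    """
    rows, cols = len(image), len(image[0])
    indices = [(r, c) for r in range(rows) for c in range(cols)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input indices.
        flat = list(pool.map(
            lambda rc: scale_sample(image, rc[0], rc[1], gain), indices))
    # Reassemble the flat list of results into rows.
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


image = [[1, 2, 3], [4, 5, 6]]
result = launch_grid(image, gain=2)
print(result)
```

On a real GPGPU the per-sample function would correspond to a kernel and the index list to the thread grid; the thread pool here only stands in for that hardware parallelism and does not by itself make the computation faster.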
For example, multiplying two {{math|''M'' × ''M''}} matrices can be processed by {{math|''M'' × ''M''}} concurrent threads on a GPGPU device, because no output element depends on another. Therefore, in the ideal case, GPGPU acceleration can yield up to an {{math|''M'' × ''M''}}-fold speedup over a sequential implementation on a traditional CPU or digital signal processor.
Line 182:
===Radar signal reconstruction and analysis===
Radar systems usually need to reconstruct numerous 3-D or 4-D data samples in real time. Traditionally, particularly in military applications, this required supercomputers. Nowadays, GPGPUs are also employed in place of supercomputers to process radar signals. For example, processing [[Synthetic aperture radar|synthetic aperture radar (SAR)]] signals usually involves multidimensional [[Fast Fourier transform|FFT]] computations.<ref>{{cite book|date=2009-10-01|pages=309–314|doi=10.1109/SIPS.2009.5336272|first1=C.|last1=Clemente|first2=M.|last2=Di Bisceglie|first3=M.|last3=Di Santo|first4=N.|last4=Ranaldo|first5=M.|last5=Spinelli|title=2009 IEEE Workshop on Signal Processing Systems|chapter=Processing of synthetic Aperture Radar data with GPGPU|isbn=978-1-4244-4335-2|s2cid=18932083|language=en}}</ref><ref>{{cite book|date=2009-10-01|pages=1–5|doi=10.1109/CISP.2009.5304418|first1=Bin|last1=Liu|first2=Kaizhi|last2=Wang|first3=Xingzhao|last3=Liu|first4=Wenxian|last4=Yu|title=2009 2nd International Congress on Image and Signal Processing|chapter=An Efficient SAR Processor Based on GPU via CUDA|isbn=978-1-4244-4129-7|s2cid=18801932}}</ref><ref>{{cite book|date=2014-06-01|pages=455–458|doi=10.1109/MIXDES.2014.6872240|first1=P.|last1=Monsurro|first2=A.|last2=Trifiletti|first3=F.|last3=Lannutti|title=2014 Proceedings of the 21st International Conference Mixed Design of Integrated Circuits and Systems (MIXDES)|chapter=Implementing radar algorithms on CUDA hardware|isbn=978-83-63578-05-3|s2cid=16482715}}</ref> GPGPUs can be used to rapidly perform the FFT and/or inverse FFT (iFFT) in this kind of application.
===Self-driving cars===