Approximate computing: Difference between revisions

Content deleted Content added
OAbot (talk | contribs)
m Open access bot: doi updated in citation with #oabot.
Citation bot (talk | contribs)
Added bibcode. Removed URL that duplicated identifier. Removed parameters. | Use this bot. Report bugs. | Suggested by Headbomb | Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox | #UCB_webform_linked 163/1032
 
(7 intermediate revisions by 5 users not shown)
Line 8:
 
; Approximate circuits
: Approximate arithmetic circuits:<ref>Jiang et al., "[http://www.ece.ualberta.ca/~jhan8/publications/survey.pdf Approximate Arithmetic Circuits: A Survey, Characterization, and Recent Applications]", the Proceedings of the IEEE, Vol. 108, No. 12, pp. 2108 - 2135, 2020.</ref> [[Adder (electronics)|adders]],<ref>J. Echavarria, et al. "FAU: Fast and Error-Optimized Approximate Adder Units on LUT-Based FPGAs", FPT, 2016.</ref><ref>J. Miao, et al. "[https://repositories.lib.utexas.edu/bitstream/handle/2152/19706/miao_thesis_201291.pdf?sequence=1 Modeling and synthesis of quality-energy optimal approximate adders]", ICCAD, 2012</ref> [[Binary multiplier|multipliers]]<ref>{{Cite book|last1=Rehman|first1=Semeen|last2=El-Harouni|first2=Walaa|last3=Shafique|first3=Muhammad|last4=Kumar|first4=Akash|last5=Henkel|first5=Jörg|date=2016-11-07|title=Architectural-space exploration of approximate multipliers|publisher=ACM|pages=80|doi=10.1145/2966986.2967005|isbn=9781450344661|s2cid=5326133}}</ref> and other [[logical circuit]]s can reduce hardware overhead.<ref>S. Venkataramani, et al. "[http://algos.inesc-id.pt/projectos/approx/FCT/Ref-15.pdf SALSA: systematic logic synthesis of approximate circuits]", DAC, 2012.</ref><ref>J. Miao, et al. "[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.453.7870&rep=rep1&type=pdf Approximate logic synthesis under general error magnitude and frequency constraints]", ICCAD, 2013</ref><ref>R. Hegde et al. "[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.8929&rep=rep1&type=pdf Energy-efficient signal processing via algorithmic noise-tolerance]", ISLPED, 1999.</ref> For example, an approximate multi-bit adder can ignore the [[carry chain]] and thus, allow all its sub-adders to perform addition operation in parallel.<ref>{{Cite journal |lastlast1=Camus |firstfirst1=Vincent |last2=Mei |first2=Linyan |last3=Enz |first3=Christian |last4=Verhelst |first4=Marian |date=December 2019 |title=Review and Benchmarking of Precision-Scalable Multiply-Accumulate Unit Architectures for Embedded Neural-Network Processing |url=https://ieeexplore.ieee.org/document/8887521/ |journal=IEEE Journal on Emerging and Selected Topics in Circuits and Systems |volume=9 |issue=4 |pages=697–711 |doi=10.1109/JETCAS.2019.2950386 |issn=2156-3357 |quote="The implementation chosen in this study assumes a rightshifting sequential multiplier as it requires a smaller firststage adder than a left-shifting design, preventing long carry propagation and sign-bit extension."|doi-access=free |bibcode=2019IJEST...9..697C }}</ref><ref>{{Cite journal |lastlast1=Nagornov |firstfirst1=Nikolay N. |last2=Lyakhov |first2=Pavel A. |last3=Bergerman |first3=Maxim V. |last4=Kalita |first4=Diana I. |date=2024 |title=Modern Trends in Improving the Technical Characteristics of Devices and Systems for Digital Image Processing |url=https://ieeexplore.ieee.org/document/10478537/ |journal=IEEE Access |volume=12 |pages=44659–44681 |doi=10.1109/ACCESS.2024.3381493 |issn=2169-3536 |quote="Addition and accumulation of high order bits are not performed until the partial product reduction for the next multiplication in the proposed architecture."|doi-access=free |bibcode=2024IEEEA..1244659N }}</ref>
; Approximate storage and memory
: Instead of [[Computer data storage|storing data]] values exactly, they can be stored approximately, e.g., by [[Data truncation|truncating]] the lower-bits in [[floating point]] data. Another method is to accept less reliable memory. For this, in [[DRAM]]<ref>{{Cite journal|last1=Raha|first1=A.|last2=Sutar|first2=S.|last3=Jayakumar|first3=H.|last4=Raghunathan|first4=V.|date=July 2017|title=Quality Configurable Approximate DRAM|journal=IEEE Transactions on Computers|volume=66|issue=7|pages=1172–1187|doi=10.1109/TC.2016.2640296|bibcode=2017ITCmp..66.1172R |issn=0018-9340|doi-access=free}}</ref> and [[eDRAM]], [[refresh rate]] assignments can be lowered or controlled.<ref>{{Cite book|last1=Kim|first1=Yongjune|last2=Choi|first2=Won Ho|last3=Guyot|first3=Cyril|last4=Cassuto|first4=Yuval|title=2019 IEEE Global Communications Conference (GLOBECOM) |chapter=On the Optimal Refresh Power Allocation for Energy-Efficient Memories |date=December 2019|chapter-url=https://ieeexplore.ieee.org/document/9013465|___location=Waikoloa, HI, USA|publisher=IEEE|pages=1–6|doi=10.1109/GLOBECOM38437.2019.9013465|isbn=978-1-7281-0962-6|arxiv=1907.01112|s2cid=195776538}}</ref> In [[Static random-access memory|SRAM]], supply voltage can be lowered<ref>{{Cite journal|last1=Frustaci|first1=Fabio|last2=Blaauw|first2=David|author2-link=David Blaauw|last3=Sylvester|first3=Dennis|last4=Alioto|first4=Massimo|date=June 2016|title=Approximate SRAMs With Dynamic Energy-Quality Management|url=https://ieeexplore.ieee.org/document/7372479|journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems|volume=24|issue=6|pages=2128–2141|doi=10.1109/TVLSI.2015.2503733|bibcode=2016ITVL...24.2128F |s2cid=8051173|issn=1063-8210}}</ref> or controlled.<ref>{{Cite journal|last1=Kim|first1=Yongjune|last2=Kang|first2=Mingu|last3=Varshney|first3=Lav R.|last4=Shanbhag|first4=Naresh R.|date=2018|title=Generalized Water-filling for Source-aware Energy-efficient SRAMs|url=https://ieeexplore.ieee.org/document/8368137|journal=IEEE Transactions on Communications|volume=66|issue=10|pages=4826–4841|doi=10.1109/TCOMM.2018.2841406|issn=0090-6778|arxiv=1710.07153|bibcode=2018ITCom..66.4826K |s2cid=24512949}}</ref> Approximate storage can be applied to reduce [[Magnetoresistive random-access memory|MRAM]]'s high write energy consumption.<ref>{{Cite journal |last1=Kim |first1=Yongjune |last2=Jeon |first2=Yoocharn |last3=Choi |first3=Hyeokjin |last4=Guyot |first4=Cyril |last5=Cassuto |first5=Yuval |date=2022 |title=Optimizing Write Fidelity of MRAMs by Alternating Water-filling Algorithm |url=https://ieeexplore.ieee.org/document/9829735 |journal=IEEE Transactions on Communications |volume=70 |issue=9 |pages=5825–5836 |doi=10.1109/TCOMM.2022.3190868 |bibcode=2022ITCom..70.5825K |s2cid=250565077 |issn=0090-6778}}</ref> In general, any [[error detection and correction]] mechanisms should be disabled.
; Software-level approximation
: There are several ways to approximate at software level. [[Memoization]] or fuzzy memoization (the use of a [[vector database]] for approximate retrieval from a cache, ''i.e.'' fuzzy caching) can be applied. Some [[iteration]]s of [[Loop (computing)|loops]] can be skipped (termed as [[loop perforation]]) to achieve a result faster. Some tasks can also be skipped, for example when a run-time condition suggests that those tasks are not going to be useful ([[task skipping]]). [[Monte Carlo algorithm]]s and [[Randomized algorithm]]s trade correctness for execution time guarantees.<ref>C.Alippi, Intelligence for Embedded Systems: a Methodological approach, Springer, 2014, pp. 283</ref> The computation can be reformulated according to paradigms that allow easily the acceleration on specialized hardware, e.g. a neural processing unit.<ref>{{Cite conference|last1=Esmaeilzadeh|first1=Hadi|last2=Sampson|first2=Adrian|last3=Ceze|first3=Luis|last4=Burger|first4=Doug|title=Neural acceleration for general-purpose approximate programs|conference=45th Annual IEEE/ACM International Symposium on Microarchitecture|year=2012|doi=10.1109/MICRO.2012.48|publisher=IEEE|pages=449–460|___location=Vancouver, BC}}</ref>
; Approximate system
: In an approximate system,<ref>{{Cite book|last1=Raha|first1=Arnab|last2=Raghunathan|first2=Vijay|title=Proceedings of the 54th Annual Design Automation Conference 2017 |chapter=Towards Full-System Energy-Accuracy Tradeoffs |date=2017|series=DAC '17|___location=New York, NY, USA|publisher=ACM|pages=74:1–74:6|doi=10.1145/3061639.3062333|isbn=9781450349277|s2cid=2503638}}</ref><ref>{{Cite journal |last1=Ghosh |first1=Soumendu Kumar |last2=Raha |first2=Arnab |last3=Raghunathan |first3=Vijay |date=2023-07-24 |title=Energy-Efficient Approximate Edge Inference Systems |url=https://dl.acm.org/doi/10.1145/3589766 |journal=ACM Transactions on Embedded Computing Systems |volume=22 |issue=4 |pages=77:1–77:50 |doi=10.1145/3589766 |issn=1539-9087|url-access=subscription }}</ref> different subsystems of the system such as the processor, memory, sensor, and communication modules are synergistically approximated to obtain a much better system-level Q-E trade-off curve compared to individual approximations to each of the subsystems.
 
==Application areas==
Approximate computing has been used in a variety of domains where the applications are error-tolerant, such as [[multimedia]] processing, [[machine learning]], [[signal processing]], [[Computational science|scientific computing]]. Therefore, approximate computing is mostly driven by applications that are related to human perception/cognition and have inherent error resilience. Many of these applications are based on statistical or probabilistic computation, such as different approximations can be made to better suit the desired objectives.<ref>{{cite journal |last1=Liu |first1=Weiqiang |last2=Lombardi |first2=Fabrizio |last3=Schulte |first3=Michael |title=Approximate Computing: From Circuits to Applications |journal=Proceedings of the IEEE |date=Dec 2020 |volume=108 |issue=12 |page=2103 |doi=10.1109/JPROC.2020.3033361 |bibcode=2020IEEEP.108.2103L |doi-access=free }}</ref>
One notable application in [[machine learning]] is that Google is using this approach in their [[Tensor processing unit]]s (TPU, a custom [[application-specific integrated circuit|ASIC]]).<ref>{{cite journal |last1=Liu |first1=Weiqiang |last2=Lombardi |first2=Fabrizio |last3=Schulte |first3=Michael |title=Approximate Computing: From Circuits to Applications |journal=Proceedings of the IEEE |date=Dec 2020 |volume=108 |issue=12 |page=2104 |doi=10.1109/JPROC.2020.3033361 |bibcode=2020IEEEP.108.2103L |doi-access=free }}</ref>
 
==Derived paradigms==