Approximate computing: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 09:01, 10 June 2024 edit OAbot (talk \| contribs) Bots 643,717 edits m Open access bot: doi updated in citation with #oabot. ← Previous edit		Latest revision as of 08:27, 25 August 2025 edit undo Citation bot (talk \| contribs) Bots 5,868,100 edits Added bibcode. Removed URL that duplicated identifier. Removed parameters. \| Use this bot. Report bugs. \| Suggested by Headbomb \| Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox \| #UCB_webform_linked 163/1032
(7 intermediate revisions by 5 users not shown)
Line 8: ; Approximate circuits : Approximate arithmetic circuits:<ref>Jiang et al., "[http://www.ece.ualberta.ca/~jhan8/publications/survey.pdf Approximate Arithmetic Circuits: A Survey, Characterization, and Recent Applications]", the Proceedings of the IEEE, Vol. 108, No. 12, pp. 2108 - 2135, 2020.</ref> [[Adder (electronics)\|adders]],<ref>J. Echavarria, et al. "FAU: Fast and Error-Optimized Approximate Adder Units on LUT-Based FPGAs", FPT, 2016.</ref><ref>J. Miao, et al. "[https://repositories.lib.utexas.edu/bitstream/handle/2152/19706/miao_thesis_201291.pdf?sequence=1 Modeling and synthesis of quality-energy optimal approximate adders]", ICCAD, 2012</ref> [[Binary multiplier\|multipliers]]<ref>{{Cite book\|last1=Rehman\|first1=Semeen\|last2=El-Harouni\|first2=Walaa\|last3=Shafique\|first3=Muhammad\|last4=Kumar\|first4=Akash\|last5=Henkel\|first5=Jörg\|date=2016-11-07\|title=Architectural-space exploration of approximate multipliers\|publisher=ACM\|pages=80\|doi=10.1145/2966986.2967005\|isbn=9781450344661\|s2cid=5326133}}</ref> and other [[logical circuit]]s can reduce hardware overhead.<ref>S. Venkataramani, et al. "[http://algos.inesc-id.pt/projectos/approx/FCT/Ref-15.pdf SALSA: systematic logic synthesis of approximate circuits]", DAC, 2012.</ref><ref>J. Miao, et al. "[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.453.7870&rep=rep1&type=pdf Approximate logic synthesis under general error magnitude and frequency constraints]", ICCAD, 2013</ref><ref>R. Hegde et al. "[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.8929&rep=rep1&type=pdf Energy-efficient signal processing via algorithmic noise-tolerance]", ISLPED, 1999.</ref> For example, an approximate multi-bit adder can ignore the [[carry chain]] and thus, allow all its sub-adders to perform addition operation in parallel.<ref>{{Cite journal \|~~last~~last1=Camus \|~~first~~first1=Vincent \|last2=Mei \|first2=Linyan \|last3=Enz \|first3=Christian \|last4=Verhelst \|first4=Marian \|date=December 2019 \|title=Review and Benchmarking of Precision-Scalable Multiply-Accumulate Unit Architectures for Embedded Neural-Network Processing ~~\|url=https://ieeexplore.ieee.org/document/8887521/~~ \|journal=IEEE Journal on Emerging and Selected Topics in Circuits and Systems \|volume=9 \|issue=4 \|pages=697–711 \|doi=10.1109/JETCAS.2019.2950386 \|issn=2156-3357 \|quote="The implementation chosen in this study assumes a rightshifting sequential multiplier as it requires a smaller firststage adder than a left-shifting design, preventing long carry propagation and sign-bit extension."\|doi-access=free \|bibcode=2019IJEST...9..697C }}</ref><ref>{{Cite journal \|~~last~~last1=Nagornov \|~~first~~first1=Nikolay N. \|last2=Lyakhov \|first2=Pavel A. \|last3=Bergerman \|first3=Maxim V. \|last4=Kalita \|first4=Diana I. \|date=2024 \|title=Modern Trends in Improving the Technical Characteristics of Devices and Systems for Digital Image Processing ~~\|url=https://ieeexplore.ieee.org/document/10478537/~~ \|journal=IEEE Access \|volume=12 \|pages=44659–44681 \|doi=10.1109/ACCESS.2024.3381493 \|issn=2169-3536 \|quote="Addition and accumulation of high order bits are not performed until the partial product reduction for the next multiplication in the proposed architecture."\|doi-access=free \|bibcode=2024IEEEA..1244659N }}</ref> ; Approximate storage and memory : Instead of [[Computer data storage\|storing data]] values exactly, they can be stored approximately, e.g., by [[Data truncation\|truncating]] the lower-bits in [[floating point]] data. Another method is to accept less reliable memory. For this, in [[DRAM]]<ref>{{Cite journal\|last1=Raha\|first1=A.\|last2=Sutar\|first2=S.\|last3=Jayakumar\|first3=H.\|last4=Raghunathan\|first4=V.\|date=July 2017\|title=Quality Configurable Approximate DRAM\|journal=IEEE Transactions on Computers\|volume=66\|issue=7\|pages=1172–1187\|doi=10.1109/TC.2016.2640296\|bibcode=2017ITCmp..66.1172R \|issn=0018-9340\|doi-access=free}}</ref> and [[eDRAM]], [[refresh rate]] assignments can be lowered or controlled.<ref>{{Cite book\|last1=Kim\|first1=Yongjune\|last2=Choi\|first2=Won Ho\|last3=Guyot\|first3=Cyril\|last4=Cassuto\|first4=Yuval\|title=2019 IEEE Global Communications Conference (GLOBECOM) \|chapter=On the Optimal Refresh Power Allocation for Energy-Efficient Memories \|date=December 2019~~\|chapter-url=https://ieeexplore.ieee.org/document/9013465~~\|___location=Waikoloa, HI, USA\|publisher=IEEE\|pages=1–6\|doi=10.1109/GLOBECOM38437.2019.9013465\|isbn=978-1-7281-0962-6\|arxiv=1907.01112\|s2cid=195776538}}</ref> In [[Static random-access memory\|SRAM]], supply voltage can be lowered<ref>{{Cite journal\|last1=Frustaci\|first1=Fabio\|last2=Blaauw\|first2=David\|author2-link=David Blaauw\|last3=Sylvester\|first3=Dennis\|last4=Alioto\|first4=Massimo\|date=June 2016\|title=Approximate SRAMs With Dynamic Energy-Quality Management~~\|url=https://ieeexplore.ieee.org/document/7372479~~\|journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems\|volume=24\|issue=6\|pages=2128–2141\|doi=10.1109/TVLSI.2015.2503733\|bibcode=2016ITVL...24.2128F \|s2cid=8051173\|issn=1063-8210}}</ref> or controlled.<ref>{{Cite journal\|last1=Kim\|first1=Yongjune\|last2=Kang\|first2=Mingu\|last3=Varshney\|first3=Lav R.\|last4=Shanbhag\|first4=Naresh R.\|date=2018\|title=Generalized Water-filling for Source-aware Energy-efficient SRAMs~~\|url=https://ieeexplore.ieee.org/document/8368137~~\|journal=IEEE Transactions on Communications\|volume=66\|issue=10\|pages=4826–4841\|doi=10.1109/TCOMM.2018.2841406\|issn=0090-6778\|arxiv=1710.07153\|bibcode=2018ITCom..66.4826K \|s2cid=24512949}}</ref> Approximate storage can be applied to reduce [[Magnetoresistive random-access memory\|MRAM]]'s high write energy consumption.<ref>{{Cite journal \|last1=Kim \|first1=Yongjune \|last2=Jeon \|first2=Yoocharn \|last3=Choi \|first3=Hyeokjin \|last4=Guyot \|first4=Cyril \|last5=Cassuto \|first5=Yuval \|date=2022 \|title=Optimizing Write Fidelity of MRAMs by Alternating Water-filling Algorithm ~~\|url=https://ieeexplore.ieee.org/document/9829735~~ \|journal=IEEE Transactions on Communications \|volume=70 \|issue=9 \|pages=5825–5836 \|doi=10.1109/TCOMM.2022.3190868 \|bibcode=2022ITCom..70.5825K \|s2cid=250565077 \|issn=0090-6778}}</ref> In general, any [[error detection and correction]] mechanisms should be disabled. ; Software-level approximation : There are several ways to approximate at software level. [[Memoization]] or fuzzy memoization (the use of a [[vector database]] for approximate retrieval from a cache, ''i.e.'' fuzzy caching) can be applied. Some [[iteration]]s of [[Loop (computing)\|loops]] can be skipped (termed as [[loop perforation]]) to achieve a result faster. Some tasks can also be skipped, for example when a run-time condition suggests that those tasks are not going to be useful ([[task skipping]]). [[Monte Carlo algorithm]]s and [[Randomized algorithm]]s trade correctness for execution time guarantees.<ref>C.Alippi, Intelligence for Embedded Systems: a Methodological approach, Springer, 2014, pp. 283</ref> The computation can be reformulated according to paradigms that allow easily the acceleration on specialized hardware, e.g. a neural processing unit.<ref>{{Cite conference\|last1=Esmaeilzadeh\|first1=Hadi\|last2=Sampson\|first2=Adrian\|last3=Ceze\|first3=Luis\|last4=Burger\|first4=Doug\|title=Neural acceleration for general-purpose approximate programs\|conference=45th Annual IEEE/ACM International Symposium on Microarchitecture\|year=2012\|doi=10.1109/MICRO.2012.48\|publisher=IEEE\|pages=449–460\|___location=Vancouver, BC}}</ref> ; Approximate system : In an approximate system,<ref>{{Cite book\|last1=Raha\|first1=Arnab\|last2=Raghunathan\|first2=Vijay\|title=Proceedings of the 54th Annual Design Automation Conference 2017 \|chapter=Towards Full-System Energy-Accuracy Tradeoffs \|date=2017\|series=DAC '17\|___location=New York, NY, USA\|publisher=ACM\|pages=74:1–74:6\|doi=10.1145/3061639.3062333\|isbn=9781450349277\|s2cid=2503638}}</ref><ref>{{Cite journal \|last1=Ghosh \|first1=Soumendu Kumar \|last2=Raha \|first2=Arnab \|last3=Raghunathan \|first3=Vijay \|date=2023-07-24 \|title=Energy-Efficient Approximate Edge Inference Systems \|url=https://dl.acm.org/doi/10.1145/3589766 \|journal=ACM Transactions on Embedded Computing Systems \|volume=22 \|issue=4 \|pages=77:1–77:50 \|doi=10.1145/3589766 \|issn=1539-9087\|url-access=subscription }}</ref> different subsystems of the system such as the processor, memory, sensor, and communication modules are synergistically approximated to obtain a much better system-level Q-E trade-off curve compared to individual approximations to each of the subsystems. ==Application areas== Approximate computing has been used in a variety of domains where the applications are error-tolerant, such as [[multimedia]] processing, [[machine learning]], [[signal processing]], [[Computational science\|scientific computing]]. Therefore, approximate computing is mostly driven by applications that are related to human perception/cognition and have inherent error resilience. Many of these applications are based on statistical or probabilistic computation, such as different approximations can be made to better suit the desired objectives.<ref>{{cite journal \|last1=Liu \|first1=Weiqiang \|last2=Lombardi \|first2=Fabrizio \|last3=Schulte \|first3=Michael \|title=Approximate Computing: From Circuits to Applications \|journal=Proceedings of the IEEE \|date=Dec 2020 \|volume=108 \|issue=12 \|page=2103 \|doi=10.1109/JPROC.2020.3033361 \|bibcode=2020IEEEP.108.2103L \|doi-access=free }}</ref> One notable application in [[machine learning]] is that Google is using this approach in their [[Tensor processing unit]]s (TPU, a custom [[application-specific integrated circuit\|ASIC]]).<ref>{{cite journal \|last1=Liu \|first1=Weiqiang \|last2=Lombardi \|first2=Fabrizio \|last3=Schulte \|first3=Michael \|title=Approximate Computing: From Circuits to Applications \|journal=Proceedings of the IEEE \|date=Dec 2020 \|volume=108 \|issue=12 \|page=2104 \|doi=10.1109/JPROC.2020.3033361 \|bibcode=2020IEEEP.108.2103L \|doi-access=free }}</ref> ==Derived paradigms==