Recurrent neural network: Difference between revisions

Revision as of 07:48, 4 August 2025 edit Hooman Mallahzadeh (talk \| contribs) Extended confirmed users 4,637 edits →Second order RNNs ← Previous edit		Latest revision as of 19:20, 27 August 2025 edit undo 82.1.63.178 (talk) →Modern
(7 intermediate revisions by 3 users not shown)
Line 30:
[[Long short-term memory]] (LSTM) networks were invented by [[Sepp Hochreiter\|Hochreiter]] and [[Jürgen Schmidhuber\|Schmidhuber]] in 1995 and set accuracy records in multiple application domains.<ref>{{Cite Q\|Q98967430}}</ref><ref name="lstm">{{Cite journal \|last1=Hochreiter \|first1=Sepp \|author-link=Sepp Hochreiter \|last2=Schmidhuber \|first2=Jürgen \|date=1997-11-01 \|title=Long Short-Term Memory \|journal=Neural Computation \|volume=9 \|issue=8 \|pages=1735–1780 \|doi=10.1162/neco.1997.9.8.1735\|pmid=9377276 \|s2cid=1915014 }}</ref> LSTM became the default choice of RNN architecture. [[Bidirectional recurrent neural networks]] (BRNN) use two RNNs that process the same input in opposite directions.<ref name="Schuster">Schuster, Mike, and Kuldip K. Paliwal. "[https://www.researchgate.net/profile/Mike_Schuster/publication/3316656_Bidirectional_recurrent_neural_networks/links/56861d4008ae19758395f85c.pdf Bidirectional recurrent neural networks]." Signal Processing, IEEE Transactions on 45.11 (1997): 2673-2681.</ref> These two are often combined, giving the bidirectional LSTM architecture.
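The bidirectional idea can be illustrated with a minimal sketch. It assumes a scalar <code>tanh</code> RNN with hand-picked weights; the names <code>rnn_pass</code> and <code>bidirectional_pass</code> are hypothetical and are not taken from the cited papers.

```python
import math

def rnn_pass(xs, w_in, w_rec):
    """Run a scalar tanh RNN over xs and return the hidden state at every step."""
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def bidirectional_pass(xs, fwd_params, bwd_params):
    """Pair each position's forward state with its backward state."""
    forward = rnn_pass(xs, *fwd_params)
    # The backward RNN reads the reversed input; its states are reversed
    # again so that index t lines up with input position t.
    backward = rnn_pass(xs[::-1], *bwd_params)[::-1]
    return list(zip(forward, backward))

# Illustrative weights only (assumed, not learned).
xs = [1.0, 0.0, 0.0, -1.0]
states = bidirectional_pass(xs, fwd_params=(0.8, 0.5), bwd_params=(0.8, 0.5))
```

At each position the forward state summarizes the inputs seen so far, while the backward state summarizes the inputs still to come, so the pair gives context from both sides.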
Around 2006, bidirectional LSTM started to revolutionize [[speech recognition]], outperforming traditional models in certain speech applications.<ref>{{Cite journal \|last1=Graves \|first1=Alex \|last2=Schmidhuber \|first2=Jürgen \|date=2005-07-01 \|title=Framewise phoneme classification with bidirectional LSTM and other neural network architectures \|journal=Neural Networks \|series=IJCNN 2005 \|volume=18 \|issue=5 \|pages=602–610 \|citeseerx=10.1.1.331.5800 \|doi=10.1016/j.neunet.2005.06.042 \|pmid=16112549 \|s2cid=1856462}}</ref><ref name="fernandez2007keyword">{{Cite conference \|last1=Fernández \|first1=Santiago \|last2=Graves \|first2=Alex \|last3=Schmidhuber \|first3=Jürgen \|year=2007 \|title=An Application of Recurrent Neural Networks to Discriminative Keyword Spotting \|url=http://dl.acm.org/citation.cfm?id=1778066.1778092 \|book-title=Proceedings of the 17th International Conference on Artificial Neural Networks \|series=ICANN'07 \|location=Berlin, Heidelberg \|publisher=Springer-Verlag \|pages=220–229 \|isbn=978-3-540-74693-5 }}</ref> They also improved large-vocabulary speech recognition<ref name="sak2014" /><ref name="liwu2015" /> and [[text-to-speech]] synthesis<ref name="fan2015">{{cite conference \|last1=Fan \|first1=Bo \|last2=Wang \|first2=Lijuan \|last3=Soong \|first3=Frank K. \|last4=Xie \|first4=Lei \|title=Photo-Real Talking Head with Deep Bidirectional LSTM \|chapter-url= \|editor= \|book-title=Proceedings of ICASSP 2015 IEEE International Conference on Acoustics, Speech and Signal Processing \|doi=10.1109/ICASSP.2015.7178899 \|date=2015 \|isbn=978-1-4673-6997-8 \|pages=4884–8 }}</ref> and were used in [[Google Voice Search\|Google voice search]] and dictation on [[Android (operating system)\|Android devices]].<ref name="sak2015">{{Cite web \|url=http://googleresearch.blogspot.ch/2015/09/google-voice-search-faster-and-more.html \|title=Google voice search: faster and more accurate \|last1=Sak \|first1=Haşim \|last2=Senior \|first2=Andrew \|date=September 2015 \|last3=Rao \|first3=Kanishka \|last4=Beaufays \|first4=Françoise \|last5=Schalkwyk \|first5=Johan}}</ref> They broke records in [[machine translation]],<ref name="sutskever2014">{{Cite journal \|last1=Sutskever \|first1=Ilya \|last2=Vinyals \|first2=Oriol \|last3=Le \|first3=Quoc V. \|year=2014 \|title=Sequence to Sequence Learning with Neural Networks \|url=https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf \|journal=Electronic Proceedings of the Neural Information Processing Systems Conference \|volume=27 \|page=5346 \|arxiv=1409.3215 \|bibcode=2014arXiv1409.3215S }}</ref> [[Language Modeling\|language modeling]]<ref name="vinyals2016">{{cite arXiv \|last1=Jozefowicz \|first1=Rafal \|last2=Vinyals \|first2=Oriol \|last3=Schuster \|first3=Mike \|last4=Shazeer \|first4=Noam \|last5=Wu \|first5=Yonghui \|date=2016-02-07 \|title=Exploring the Limits of Language Modeling \|eprint=1602.02410 \|class=cs.CL}}</ref> and multilingual language processing.<ref name="gillick2015">{{cite arXiv \|last1=Gillick \|first1=Dan \|last2=Brunk \|first2=Cliff \|last3=Vinyals \|first3=Oriol \|last4=Subramanya \|first4=Amarnag \|date=2015-11-30 \|title=Multilingual Language Processing From Bytes \|eprint=1512.00103 \|class=cs.CL}}</ref> Also, LSTM
combined with [[convolutional neural network]]s (CNNs) improved [[automatic image captioning]].<ref name="vinyals2015">{{cite arXiv \|last1=Vinyals \|first1=Oriol \|last2=Toshev \|first2=Alexander \|last3=Bengio \|first3=Samy \|last4=Erhan \|first4=Dumitru \|date=2014-11-17 \|title=Show and Tell: A Neural Image Caption Generator \|eprint=1411.4555 \|class=cs.CV }}</ref>

Line 270:
===Multiple timescales model===
A multiple timescales recurrent neural network (MTRNN) is a neural-based computational model that can simulate the functional hierarchy of the brain through self-organization depending on the spatial connection between neurons and on distinct types of neuron activities, each with distinct time properties.<ref>{{Cite journal \|last1=Yamashita \|first1=Yuichi \|last2=Tani \|first2=Jun \|date=2008-11-07 \|title=Emergence of Functional Hierarchy in a Multiple Timescale Neural Network Model: A Humanoid Robot Experiment \|journal=PLOS Computational Biology \|volume=4 \|issue=11 \|pages=e1000220 \|doi=10.1371/journal.pcbi.1000220 \|pmc=2570613 \|pmid=18989398 \|bibcode=2008PLSCB...4E0220Y \|doi-access=free }}</ref><ref>{{Cite journal \|last1=Alnajjar \|first1=Fady \|last2=Yamashita \|first2=Yuichi \|last3=Tani \|first3=Jun \|year=2013 \|title=The hierarchical and functional connectivity of higher-order cognitive mechanisms: neurorobotic model to investigate the stability and flexibility of working memory \|journal=Frontiers in Neurorobotics \|volume=7 \|page=2 \|doi=10.3389/fnbot.2013.00002 \|pmc=3575058 \|pmid=23423881\|doi-access=free }}</ref> With such varied neuronal activities, continuous sequences of any set of behaviors are segmented into reusable primitives, which in turn are flexibly integrated into diverse sequential behaviors.
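The notion of units with "distinct time properties" can be sketched with a leaky-integrator update in which each unit has its own time constant. The update rule, the function name <code>leaky_unit</code> and the time constants below are assumptions chosen for illustration; the sketch omits the activation function and the connections between units that a full MTRNN would have.

```python
def leaky_unit(inputs, tau):
    """Leaky-integrator state: u <- (1 - 1/tau) * u + (1/tau) * input."""
    u = 0.0
    trace = []
    for x in inputs:
        u = (1.0 - 1.0 / tau) * u + (1.0 / tau) * x
        trace.append(u)
    return trace

# The same step input is fed to a fast unit and a slow unit.
step_input = [1.0] * 20
fast = leaky_unit(step_input, tau=2.0)    # small time constant: reacts quickly
slow = leaky_unit(step_input, tau=50.0)   # large time constant: changes gradually
```

The fast unit tracks the input almost immediately, whereas the slow unit integrates over many steps, which is one way a network can represent both short primitives and longer sequences.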
The biological plausibility of such a hierarchy was discussed in the [[memory-prediction framework\|memory-prediction]] theory of brain function by [[Jeff Hawkins\|Hawkins]] in his book ''[[On Intelligence]]''.{{Citation needed \|date=June 2017}} Such a hierarchy also agrees with theories of memory posited by philosopher [[Henri Bergson]], which have been incorporated into an MTRNN model.<ref name="auto1"/><ref>{{Cite web \| url=http://jnns.org/conference/2018/JNNS2018_Technical_Programs.pdf \| title= Proceedings of the 28th Annual Conference of the Japanese Neural Network Society (October, 2018) \| access-date=2021-02-06 \| archive-date=2020-05-09 \| archive-url=https://web.archive.org/web/20200509004753/http://jnns.org/conference/2018/JNNS2018_Technical_Programs.pdf \| url-status=dead }}</ref>

===Memristive networks===
Line 288:
}}</ref> The [[memristors]] (memory resistors) are implemented with thin-film materials in which the resistance is electrically tuned via the transport of ions or oxygen vacancies within the film. [[DARPA]]'s [[SyNAPSE\|SyNAPSE project]] has funded IBM Research and HP Labs, in collaboration with the Boston University Department of Cognitive and Neural Systems (CNS), to develop neuromorphic architectures that may be based on memristive systems. [[Memristive networks]] are a particular type of [[physical neural network]] with properties very similar to those of (Little-)Hopfield networks: they have continuous dynamics, a limited memory capacity, and natural relaxation via the minimization of a function that is asymptotic to the [[Ising model]]. In this sense, the dynamics of a memristive circuit have an advantage over those of a resistor-capacitor network in that they exhibit more interesting non-linear behavior. From this point of view, engineering analog memristive networks amounts to a particular type of [[neuromorphic engineering]] in which the device behavior depends on the circuit wiring or topology.
The evolution of these networks can be studied analytically using variations of the [[Caravelli–Traversa–Di Ventra equation]].<ref>{{cite journal \|last1=Caravelli \|first1=Francesco \|last2=Traversa \|first2=Fabio Lorenzo \|last3=Di Ventra \|first3=Massimiliano \|title=The complex dynamics of memristive circuits: analytical results and universal slow relaxation \|year=2017 \|doi=10.1103/PhysRevE.95.022140 \|pmid=28297937 \|volume=95 \|issue= 2 \|page= 022140 \|journal=Physical Review E\|bibcode=2017PhRvE..95b2140C \|s2cid=6758362\|arxiv=1608.08651 }}</ref>

=== Continuous-time ===
Line 307:
CTRNNs have been applied to [[evolutionary robotics]], where they have been used to address vision,<ref>{{citation \|last1=Harvey \|first1=Inman \|title=3rd international conference on Simulation of adaptive behavior: from animals to animats 3 \|pages=392–401 \|year=1994 \|contribution=Seeing the light: Artificial evolution, real vision \|contribution-url=https://www.researchgate.net/publication/229091538_Seeing_the_Light_Artificial_Evolution_Real_Vision \|last2=Husbands \|first2=Phil \|last3=Cliff \|first3=Dave}}</ref> co-operation,<ref name="Evolving communication without dedicated communication channels">{{cite conference \|last=Quinn \|first=Matt \|year=2001 \|title=Evolving communication without dedicated communication channels \|pages=357–366 \|doi=10.1007/3-540-44811-X_38 \|isbn=978-3-540-42567-0 \|book-title=Advances in Artificial Life: 6th European Conference, ECAL 2001}}</ref> and minimal cognitive behaviour.<ref name="The dynamics of adaptive behavior: A research program">{{cite journal \|last=Beer \|first=Randall D. 
\|year=1997 \|title=The dynamics of adaptive behavior: A research program \|journal=Robotics and Autonomous Systems \|volume=20 \|issue=2–4 \|pages=257–289 \|doi=10.1016/S0921-8890(96)00063-2}}</ref> Note that, by the [[Shannon sampling theorem]], discrete-time recurrent neural networks can be viewed as continuous-time recurrent neural networks in which the differential equations have been transformed into equivalent [[difference equation]]s.<ref name="Sherstinsky-NeurIPS2018-CRACT-3">{{cite conference \|last=Sherstinsky \|first=Alex \|date=2018-12-07 \|editor-last=Bloem-Reddy \|editor-first=Benjamin \|editor2-last=Paige \|editor2-first=Brooks \|editor3-last=Kusner \|editor3-first=Matt \|editor4-last=Caruana \|editor4-first=Rich \|editor5-last=Rainforth \|editor5-first=Tom \|editor6-last=Teh \|editor6-first=Yee Whye \|title=Deriving the Recurrent Neural Network Definition and RNN Unrolling Using Signal Processing \|url=https://www.researchgate.net/publication/331718291 \|conference=Critiquing and Correcting Trends in Machine Learning Workshop at NeurIPS-2018 \|conference-url=https://ml-critique-correct.github.io/}}</ref> This transformation can be thought of as occurring after the post-synaptic node activation functions <math>y_i(t)</math> have been [[Low-pass filter\|low-pass filtered]] but prior to sampling.

Recurrent neural networks are in fact [[recursive neural network]]s with a particular structure: that of a linear chain. Whereas recursive neural networks operate on any hierarchical structure, combining child representations into parent representations, recurrent neural networks operate on the linear progression of time, combining the previous time step and a hidden representation into the representation for the current time step.
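The linear-chain relationship can be checked with a small sketch. It assumes a scalar composition function with fixed illustrative weights; <code>combine</code>, <code>recursive_eval</code> and <code>recurrent_eval</code> are hypothetical names. A recursive network evaluated on a left-nested chain performs the same computation as a recurrent loop over the sequence.

```python
import math

def combine(left, right, w_left=0.5, w_right=0.8):
    """Shared composition function: merge two representations into one."""
    return math.tanh(w_left * left + w_right * right)

def recursive_eval(tree):
    """Recursive network: a tree is a leaf value (float) or a (left, right) pair."""
    if isinstance(tree, tuple):
        left, right = tree
        return combine(recursive_eval(left), recursive_eval(right))
    return tree

def recurrent_eval(xs, h0=0.0):
    """Recurrent network: apply the same function step by step along the sequence."""
    h = h0
    for x in xs:
        h = combine(h, x)  # previous state on the left, current input on the right
    return h

xs = [0.3, -1.0, 0.7]
# Left-nested chain (((h0, x1), x2), x3): the tree degenerates into a sequence.
chain = (((0.0, xs[0]), xs[1]), xs[2])
chain_result = recursive_eval(chain)
loop_result = recurrent_eval(xs)
```

A balanced tree such as <code>((x1, x2), (x3, x4))</code> would also be a valid input to <code>recursive_eval</code>, but it has no equivalent in the step-by-step loop, which is the sense in which the recurrent case is the more restricted one.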
From a time-series perspective, RNNs can appear as nonlinear versions of [[finite impulse response]] and [[infinite impulse response]] filters and also as a [[nonlinear autoregressive exogenous model]] (NARX).<ref>{{cite journal \|url={{google books \|plainurl=y \|id=830-HAAACAAJ \|page=208}} \|title=Computational Capabilities of Recurrent NARX Neural Networks \|last1=Siegelmann \|first1=Hava T. \|last2=Horne \|first2=Bill G. \|last3=Giles \|first3=C. Lee \|journal= IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics\|volume=27 \|issue=2 \|pages=208–15 \|year=1995 \|pmid=18255858 \|doi=10.1109/3477.558801 \|citeseerx=10.1.1.48.7468 }}</ref> An RNN has an infinite impulse response, whereas [[convolutional neural network]]s have a [[finite impulse response]]. Both classes of networks exhibit temporal [[dynamic system\|dynamic behavior]].<ref>{{Cite journal \|last=Miljanovic \|first=Milos \|date=Feb–Mar 2012 \|title=Comparative analysis of Recurrent and Finite Impulse Response Neural Networks in Time Series Prediction \|url=http://www.ijcse.com/docs/INDJCSE12-03-01-028.pdf \|journal=Indian Journal of Computer and Engineering \|volume=3 \|issue=1}}</ref> A finite impulse recurrent network is a [[directed acyclic graph]] that can be unrolled and replaced with a strictly feedforward neural network, while an infinite impulse recurrent network is a [[directed cyclic graph]] that cannot be unrolled.
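The difference between the two kinds of impulse response can be seen in a minimal sketch that uses linear filters as stand-ins for the networks (an assumption made for clarity; real networks add nonlinearities). The function names, kernel and feedback coefficient are illustrative.

```python
def fir_response(xs, kernel):
    """Causal convolution: each output depends on the last len(kernel) inputs only."""
    out = []
    for t in range(len(xs)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * xs[t - k]
        out.append(acc)
    return out

def iir_response(xs, a):
    """Linear recurrence h_t = a * h_{t-1} + x_t: the state feeds back into itself."""
    h = 0.0
    out = []
    for x in xs:
        h = a * h + x
        out.append(h)
    return out

# A single unit impulse followed by zeros.
impulse = [1.0] + [0.0] * 7
fir = fir_response(impulse, kernel=[1.0, 0.5, 0.25])
iir = iir_response(impulse, a=0.5)
```

With these values the convolution's output is exactly zero once the impulse has left its three-tap window, while the recurrence keeps halving its state and never reaches zero within the simulated steps.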
The effect of memory-based learning for the recognition of sequences can also be implemented by a more biologically based model which uses the silencing mechanism exhibited in neurons with relatively high-frequency [[Action potential\|spiking activity]].<ref>{{Cite journal \|last1=Hodassman \|first1=Shiri \|last2=Meir \|first2=Yuval \|last3=Kisos \|first3=Karin \|last4=Ben-Noam \|first4=Itamar \|last5=Tugendhaft \|first5=Yael \|last6=Goldental \|first6=Amir \|last7=Vardi \|first7=Roni \|last8=Kanter \|first8=Ido \|date=2022-09-29 \|title=Brain inspired neuronal silencing mechanism to enable reliable sequence identification \|journal=Scientific Reports \|volume=12 \|issue=1 \|pages=16003 \|doi=10.1038/s41598-022-20337-x \|pmid=36175466 \|pmc=9523036 \|arxiv=2203.13028 \|bibcode=2022NatSR..1216003H \|issn=2045-2322\|doi-access=free }}</ref> Additional stored states, with the storage under direct control of the network, can be added to both [[infinite impulse response\|infinite-impulse]] and [[finite impulse response\|finite-impulse]] networks. The storage can also be replaced by another network or graph if it incorporates time delays or has feedback loops. Such controlled states are referred to as gated states or gated memory and are part of [[long short-term memory]] networks (LSTMs) and [[gated recurrent unit]]s. Such a network is also called a feedback neural network (FNN).
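A gated state can be sketched as follows. This is a deliberately simplified scalar update, loosely modeled on the update gate of a gated recurrent unit; it is not the full LSTM or GRU formulation, the gate here depends only on the input, and the weights and the name <code>gated_step</code> are assumptions for illustration.

```python
import math

def sigmoid(v):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def gated_step(h, x, w_gate, w_cand):
    """One update of a scalar gated state.

    The gate z in (0, 1) is computed by the network itself and decides how
    much of the stored state h is overwritten by the candidate value.
    """
    z = sigmoid(w_gate * x)
    candidate = math.tanh(w_cand * x)
    return (1.0 - z) * h + z * candidate

h = 0.9
held = gated_step(h, x=-10.0, w_gate=1.0, w_cand=1.0)      # gate nearly closed
rewritten = gated_step(h, x=10.0, w_gate=1.0, w_cand=1.0)  # gate nearly open
```

When the gate is nearly closed the stored value is carried forward almost unchanged, and when it is nearly open the state is replaced by the candidate, which is what "storage under direct control of the network" means in practice.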