Time delay neural network
 
=== Implementation ===
The precise architecture of a TDNN (time delays, number of layers) is mostly set by the designer according to the classification problem and the most useful context sizes; the delays or context windows are chosen specifically for each application. Work has also been done on adaptable time-delay TDNNs<ref>Christian Koehler and Joachim K. Anlauf, ''[https://web.archive.org/web/20190904162647/https://pdfs.semanticscholar.org/9a0a/08e4d9a4cea6fa035555f2ee54bdae673614.pdf An adaptable time-delay neural-network algorithm for image sequence analysis]'', IEEE Transactions on Neural Networks 10.6 (1999): 1531-1536</ref> in which this manual tuning is eliminated.
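The core operation can be illustrated with a minimal sketch: a single TDNN layer splices the input frames at a set of chosen delays and applies one shared affine transform at every time step. The sizes, delays, and `tanh` nonlinearity below are illustrative assumptions, not taken from any particular system.

```python
import numpy as np

def tdnn_layer(x, w, b, delays):
    """One TDNN layer: at each time step, splice the input frames at the
    given delays and apply a shared (time-invariant) affine transform.

    x: (T, d_in) input sequence
    w: (len(delays) * d_in, d_out) shared weights, b: (d_out,) bias
    delays: frame offsets relative to the current step, e.g. [-2, -1, 0, 1, 2]
    Returns (T', d_out) with T' = T - (max(delays) - min(delays)).
    """
    T, d_in = x.shape
    lo, hi = min(delays), max(delays)
    out = []
    # Only time steps whose full context window lies inside the sequence.
    for t in range(-lo, T - hi):
        context = np.concatenate([x[t + d] for d in delays])
        out.append(np.tanh(context @ w + b))
    return np.stack(out)

# Hypothetical sizes: 16 input features, context [-2..2], 8 hidden units.
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 16))
w = rng.standard_normal((5 * 16, 8)) * 0.1
b = np.zeros(8)
h = tdnn_layer(x, w, b, delays=[-2, -1, 0, 1, 2])
# h.shape == (96, 8): each output frame summarises 5 input frames.
```

Because the same weights are applied at every shift, the layer is equivalent to a one-dimensional convolution over time, which is how TDNNs are typically implemented in modern toolkits.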
 
=== State of the art ===
In early comparisons, TDNN-based phoneme recognizers performed favourably against HMM-based phone models.<ref name="phoneme detection" /><ref name=":3" /> Modern deep TDNN architectures include many more hidden layers and sub-sample or pool connections over broader contexts at higher layers, achieving up to 50% word-error-rate reduction over [[Mixture model|GMM]]-based acoustic models.<ref name=":4">Vijayaditya Peddinti, Daniel Povey, Sanjeev Khudanpur, ''[https://web.archive.org/web/20180306041537/https://pdfs.semanticscholar.org/ced2/11de5412580885279090f44968a428f1710b.pdf A time delay neural network architecture for efficient modeling of long temporal contexts]'', Proceedings of Interspeech 2015</ref><ref name=":5">David Snyder, Daniel Garcia-Romero, Daniel Povey, ''[http://danielpovey.com/files/2015_asru_tdnn_ubm.pdf A Time-Delay Deep Neural Network-Based Universal Background Models for Speaker Recognition]'', Proceedings of ASRU 2015.</ref> While successive TDNN layers are intended to learn features of increasing context width, each layer still models only a bounded local context. When longer-distance relationships and pattern sequences must be processed, learning states and state sequences becomes important, and TDNNs can be combined with other modelling techniques.<ref name=":6">Patrick Haffner, Alexander Waibel, ''[http://papers.nips.cc/paper/580-multi-state-time-delay-networks-for-continuous-speech-recognition.pdf Multi-State Time Delay Neural Networks for Continuous Speech Recognition]'', Advances in Neural Information Processing Systems, 1992, Morgan Kaufmann.</ref><ref name=":1" /><ref name=":2" />
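How stacked layers widen the temporal context can be seen with a small calculation: each layer's splicing offsets extend the total input window seen at the top of the stack. The particular stack of offsets below is illustrative only, chosen to resemble the sub-sampled splicing style described in the literature.

```python
def context_window(layer_offsets):
    """Total input context [lo, hi] (in frames, relative to the current
    frame) seen at the top of a TDNN stack, given each layer's splicing
    offsets relative to its own input."""
    lo = hi = 0
    for offsets in layer_offsets:
        lo += min(offsets)
        hi += max(offsets)
    return lo, hi

# Illustrative 4-layer stack: a full first-layer splice, then
# sub-sampled splices at higher layers.
stack = [[-2, -1, 0, 1, 2], [-1, 2], [-3, 3], [-7, 2]]
lo, hi = context_window(stack)
# (lo, hi) == (-13, 9): a 23-frame total input context, reached with
# far fewer connections than splicing all 23 frames in one layer.
```

This is why deeper TDNNs with sparse higher-layer splicing can model long temporal contexts efficiently: the context grows additively with depth while each layer stays narrow.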
 
== Applications ==